Voice recognition preprocessing voice signal / Yeoh Ling Tze by Yeoh, Ling Tze
FACULTY OF COMPUTER SCIENCE AND INFORMATION TECHNOLOGY




Preprocessing Voice Signal 
By 
YEOH LING TZE 
WEK010314 
Supervisor 
Mr. Yamani Idna b. Idris 
Moderator 
Mr. Amirrudin Hj Kamsin
Submission Date 










Voice Recognition Abstract 
Abstract 
Voice recognition is the process by which a computer maps an acoustic
signal (i.e. spoken words) to the textual representation of those words. Voice recognition
is also known as Speech Recognition, Speech Understanding System or Automatic
Speech Recognition (ASR). Voice recognition techniques have been in existence for
approximately 30 years and many currently achieve an accuracy rate of
approximately 99 percent for isolated word recognition. Voice recognition work now runs
the gamut from practical 24-hour-a-day applications of isolated word recognizers in
industrial or governmental operations, to research and development work on versatile
recognizers of complex spoken sentences. The number of commercial devices being sold
is apparently expanding exponentially, public awareness is rising and the prospects for
future impact on human interaction with machines are very bright. Voice recognition is
part of a broader speech processing technology that also involves computer
identification and verification of speakers, computer synthesis of speech and production
of stored spoken responses, computer analysis of the physical and psychological state of
the speaker, efficient transmission of spoken conversations, detection of speech pathologies
and aids to the handicapped. Isolated word recognition has many applications in the
areas of voice commanded machine operation, voice dialing, command and control









Voice Recognition Acknowledgement 
Acknowledgement 
I would like to thank the many friends who have helped me in my frequent
struggles with various aspects of the subject of voice recognition. First and foremost,
warm and grateful thanks go to Mr. Yamani Idna b. Idris, my supervisor for my final
year project, Voice Recognition. He overcame my initial hesitations with his
enthusiasm and nurtured my understanding with his amazing generosity and
patience. I greatly appreciate the advice and encouragement he gave me during
the development of this project. Thanks also go to my moderator, Encik Amirrudin Hj
Kamsin, for giving me advice and ideas to enhance my project. He has also assisted my
development in innumerable other ways. These two have provided support in times of
crisis and criticism in times of confidence.
Many students have worked with me on aspects of voice communication. I would
like to mention in particular Don Kong, Lee Kuan, and Vincent. Many of the ideas stem
from fascinating conversations with my project partner, Tc1.: Boon Siong. He provides a
great deal of stimulation for my work. I am heavily obliged to my family for donating
that most precious of all gifts - time - to help improve my viva presentation. I have been
very fortunate in obtaining support of my voice recognition research from the main











Voice Recognition Table of Contents 
Table of Contents
Abstract 
Acknowledgement 
Table of Contents 
List of Figures 






1.1 Project Introduction 
1.2 Objectives 
1.3 Scope 
1.4 Chapter to Chapter Outline







2.0 Literature Review 
2.1 Definition of Voice Recognition 
2.2 Types of Voice Recognition 
2.3 Uses and Applications 
2.4 Review of Existing Design 
2.4.1 Features
2.4.2 Block Diagram of RSC-4x
2.4.3 Algorithm used in RSC-4x
2.4.4 RSC-4x Architecture
2.4.5 How RSC-4 Works in Recognizing Voice
2.5 Review of Existing Technique Approach
2.5.1 Template Matching
2.5.2 HMM-Based
2.5.3 Word Trigram Models
2.5.4 Neural Network approach
2.5.5 Acoustic Phonetic Analysis
2.6 VHDL
2.6.1 Primary Design Unit Model Structures
2.6.2 VHDL PACKAGES
2.6.3 Design Units and Libraries
2.6.4 Advantages of using VHDL
2.7 FPGA
2.8 Analog-to-Digital Converter

























3.0 Project Methodology 
3.1 Introduction 
3.2 Design Method

















3.3.2 Brainstorming 
3.3.3 Library 
3.3.4 Internet
3.4 Development Tools
3.4.1 XStend Board V1.3.2 and XS 100 Board







4.0 Proposed Design 
4.1 Introduction 
4.2 Inputting Voice Signals through the XStend Board
V1.3.2 Codec and outputting it to the LED
4.3 Voice Signal Preprocessing
4.3.1 Speech Digitization
4.4 FPGA (Field Programmable Gate Array) Circuitry
4.4.1 Top level design of the system
4.4.2 Operations in each module
4.4.3 Overrun Error
4.5 Shift Register
4.6 Voice recognition algorithm
4.7 20-bit Register














5.0 System Implementation 
5.1 Introduction
5.2 XILINX WebPACK ISE
5.3 VHDL and Preprocessing Voice Signal
5.4 Implementation Steps
5.5 Modules Pin Description
5.5.1 Top level pin description of Preprocessing
Voice Signal
5.5.2 Codec Interface module Pin Description
5.5.3 Channel module Pin Description
5.5.4 Clock Generator module Pin Description
5.5.5 Clock Divider module Pin Description
5.5.6 LED Decoder module Pin Description
5.6 Writing VHDL code















6.0 System Testing
6.1 Introduction
6.2 Design Simulation
6.3 Clock Generator module Simulation
6.4 Channel module Simulation
6.4.1 Receive data from codec ADC
6.4.2 Handle reading of ADC data from codec interface










6.5 Codec Interface module Simulation
6.5.1 u0 Process
6.5.2 u_left and u_right Process
6.5.3 Overrun process
6.6 Clock Divider module Simulation
6.7 Led module Simulation
7.0 System Evaluation
7.1 Introduction
7.2 Problems Encountered and Solutions
7.3 System Strength
7.4 System Constraints
7.5 Future Enhancements




List of Figures 
Figure 1.1: Project Schedule
Figure 2.1: The block diagram of the RSC-4x
Figure 2.2: RSC-4x Internal Block Diagram
Figure 2.3: Phonetic Categories for a typical Feature Analysis system
Figure 2.4: Sound Category Sequence for yes, no, begin and stop
Figure 2.5: Field Programmable Gate Arrays (FPGA)
Figure 3.1: Simplified top-down design methodology
Figure 3.2: XStend Board Layout
Figure 3.3: XS 100 Board Layout
Figure 3.4: XStend Board view
Figure 3.5: Xilinx WebPACK 4.2
Figure 3.6: ModelSim XE Simulator
Figure 3.7: XS Tool
Figure 4.1: The general view of the voice recognition process
Figure 4.2: Connections between XStend Codec chip and FPGA
Figure 4.3: The block diagram of the entire process
Figure 4.4: Undersampled signal spectrum
Figure 4.5: Oversampled signal spectrum
Figure 4.6: A simplified view of fl .. PD circuitry
Figure 4.7: The top level design of the Voice Recognition system
Figure 4.8: Block diagram of the reading operation
Figure 4.9: Normal Flow
Figure 4.10: Register with overrun error
Figure 4.11: Logic diagram of an 8 bit serial-in parallel-out shift register










Figure 4.13: High Level Diagram of the LED Decoder
Figure 4.14: LED Decoder Flow Chart
Figure 5.1: Xilinx's WebPACK ISE interface on 'File' menu bar options
Figure 5.2: Xilinx's WebPACK ISE interface with the editing Window
Figure 5.3: An example of the ModelSim XE Transcript Window
Figure 5.4: Top level pin description of Preprocessing Voice Signal system
Figure 5.5: High Level Diagram of the LED Decoder
Figure 5.6: A portion of Channel.vhd test bench module
Figure 6.1: Timing diagram of the main clock used in CLKGEN.VHD
Figure 6.2: Simulation result of CLKGEN.VHD module
Figure 6.3: Simulation result of TSTCHANNEL_RCVADC.VHD
Figure 6.4: Simulation result of TSTCHANNEL_RCVADC1.VHD
Figure 6.5: Simulation result of TSTCHANNEL_READOUT.VHD
Figure 6.6: Simulation result of TSTCHANNEL_READOUT1.VHD
Figure 6.7: Simulation result of TSTCHANNEL_OVERRUN.VHD
Figure 6.8: Simulation result of TSTCHANNEL_OVERRUN1.VHD
Figure 6.9: Simulation result of TSTCODEC_INTFC.VHD
Figure 6.10: Simulation result of TSTCLOCK_DIVIDER.VHD
Figure 6.11: Simulation result of TSTLED.VHD
List of Tables
Table 2.1: Category of voice recognition parameters with its ranges
Table 2.2: Channel Selection
Table 3.1: Functions of the resources on XStend Board and XS 100 Board
Table 4.1: Number display by the LED segments
Table 5.1: Codec interface module pins description
Table 5.2: Channel module pins description
Table 5.3: Clock Generator module Pin Description










Voice Recognition Introduction 
Chapter 1: Introduction 
1.1 Project Introduction 
Voice recognition technology has been in research and development for more
than three decades. Voice recognition is the process of transcribing speech into text
automatically. This technology is now close to becoming practical. One can
envision a world in which, rather than a keyboard interface, all computers are equipped
with a voice interface. In such
a world, the product inspector can enter data verbally, leaving his or her hands free to
perform production operations. For this reason, a large amount of research and
development is presently taking place in the field of voice recognition and understanding.
Voice interfaces will gradually replace traditional keyboards. They will evolve with other
high-technology fields, in particular the field of artificial intelligence.
1.2 Objectives 
Below are the objectives of developing this project:
a) Preprocess the voice signal.
b) Develop a simple yet complete voice recognition algorithm using zero
crossing.
c) Replace the remote control with a microphone.
d) Help the disabled or people who suffer from diseases such as arthritis
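Objective (b) names zero crossing as the basis of the recognition algorithm. The project itself targets VHDL on an FPGA; the Python sketch below is only an illustration of the general idea, and the function name and the convention of treating a zero-valued sample as non-negative are assumptions made for this example.

```python
def zero_crossing_rate(samples):
    """Return the fraction of adjacent sample pairs whose signs differ.

    A zero-valued sample is treated as non-negative (an assumption made
    here so that a sample sitting exactly on zero is not counted twice).
    """
    if len(samples) < 2:
        return 0.0
    crossings = 0
    for prev, cur in zip(samples, samples[1:]):
        if (prev >= 0) != (cur >= 0):
            crossings += 1
    return crossings / (len(samples) - 1)
```

A signal with a high zero crossing rate changes sign often, which is typical of noisy or fricative-like sounds, while voiced sounds tend to give a lower rate.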








1.3 Scope 
The scope of the system determines its range and how it will work.
The main scope of the project is to design a voice recognition system in terms of
preprocessing the voice signal, then implementing the best voice recognition algorithm and
at the same time generating VHDL source code. Individual modules will be defined in
terms of their functions and the interconnections between them. The first phase of the project is
about learning the fundamentals of voice recognition. During this phase, VHDL is also
learned.
1.4 Chapter to Chapter Outline 
In developing this system, many stages were gone through, from drafting and
planning until the system was fully developed. This report covers every step
taken into consideration while developing the voice recognition system, divided into
chapters as outlined below.
Chapter 1 introduces the world of voice recognition, which is also
known as speech recognition. The objectives and scope of developing this system are
provided here. The chapter to chapter outline and project schedule are also provided in this
chapter.
Chapter 2 presents the literature review of the system, which will include the










This chapter will also touch briefly on VHDL as a hardware description language
and FPGA.
Chapter 3 covers the methodology used in developing the system. Project development
techniques are mentioned here, as well as the development tools used.
Chapter 4 is where the details of the proposed design are given. The process of
preprocessing the voice signal, starting from the input until the expected output, will be









Voice Recognition Literature Review 
Chapter 2 : Literature Review 
2.1 Definition of Voice Recognition 
Voice Recognition is defined as the ability of a machine to recognize and understand
spoken words. In detail, voice recognition is the process of converting an acoustic
signal, captured by a microphone, into a set of words. The recognized words can be the
final result, as for applications such as command and control, data entry and document
writing.
The following definitions are the basics needed for understanding voice recognition 
technology. 
• Utterance 
An utterance is the vocalization (speaking) of a word or words that represent a single 
meaning to the computer. Utterances can be a single word, a few words, a sentence, or 
even multiple sentences. 
• Accuracy 
The ability of a recognizer can be examined by measuring its accuracy, that is, how well it
recognizes utterances. This includes not only correctly identifying an utterance but also
identifying when the spoken utterance is not in its vocabulary. Good systems have an
accuracy of 98% or more. The acceptable accuracy of a system really depends on the
application.
• Speaker Dependence/Independence 
The first criterion is to consider who is going to use the system. The reason for this is that a
speaker-dependent system must be trained by the user. A training session is
required, in which each word to be recognized is spoken into the system. In most cases, each
word must be repeated at least five times for the training session to be successful. The
system must be retrained by each different user, which is obviously time consuming. However,
for many applications a speaker-dependent system is the most practical and economical
choice, since it is very accurate in recognizing a given user's voice. Meanwhile, a
speaker-independent system does not require training and will respond to the voice input
of any user. However, the trade-off is usually a smaller vocabulary and increased software
complexity. [Andrew, 1987]
• Vocabulary 
Only a small vocabulary will be used in the recognition process, for example: zero, one,
and two. Many industrial and consumer applications exist that could utilize this simple
vocabulary. With today's technology, good voice recognition is directly related to small
vocabularies. As the system vocabulary increases, the system memory, complexity,
response time and cost all increase in direct proportion. Generally, smaller vocabularies
are easier for a computer to recognize, while larger vocabularies are more difficult.
[Andrew, 1987]
2.2 Types of Voice Recognition 
Voice recognition systems can be separated into several different classes by describing
what types of utterances they have the ability to recognize. These classes are based on the 









finishes an utterance. Most packages can fit into more than one class, depending on 
which mode they're using. 
• Isolated Word versus Connected Speech 
Voice recognition systems fall into one of two functional categories: isolated-word recognition or
connected-speech understanding systems. As the name implies, isolated-word recognition
systems are designed to recognize or verify single-word utterances. The technology for
these systems is well developed and as a result, many commercial products employ
isolated-word recognition.
A speech signal that contains whole phrases and sentences is called connected, or
continuous, speech. The speech signal must therefore be broken down into individual word
signals, which raises several problems. First, it is difficult to determine the precise boundaries of an individual word
within a connected speech signal. For the boundaries to be detected reliably, the speaker must make a 200-300 ms pause between
words. Second, the signal for a given word in a connected phrase does not closely resemble
the signal for the same word if spoken separately. Finally, the pronunciation of a given
word is affected by the words and punctuation around it. [Andrew, 1987] All these
problems make connected-speech recognition a difficult task indeed.
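The 200-300 ms pause requirement can be illustrated with a small sketch that splits a sampled signal into word segments wherever the amplitude stays below a threshold for long enough. This is a simplified illustration in Python, not the method used in this project; the function name, the amplitude threshold of 0.05 and the default 200 ms pause length are assumptions chosen for the example.

```python
def split_on_pauses(samples, sample_rate, threshold=0.05, min_pause_ms=200):
    """Return (start, end) index pairs of segments separated by long pauses.

    `end` is exclusive. A sample counts as silence when its absolute value
    is below `threshold` (a hypothetical value chosen for illustration).
    """
    min_pause = int(sample_rate * min_pause_ms / 1000)
    segments = []
    start = None      # index where the current word began, if any
    silent_run = 0    # consecutive quiet samples since the last loud one
    for i, s in enumerate(samples):
        if abs(s) >= threshold:
            if start is None:
                start = i
            silent_run = 0
        elif start is not None:
            silent_run += 1
            if silent_run >= min_pause:
                # The pause is long enough: close the word at its last loud sample.
                segments.append((start, i - silent_run + 1))
                start = None
                silent_run = 0
    if start is not None:
        # The signal ended inside a word; trim any trailing quiet samples.
        segments.append((start, len(samples) - silent_run))
    return segments
```

With a pause shorter than `min_pause_ms`, two words are merged into one segment, which is exactly the boundary problem described above for connected speech.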
• Voice Recognition versus Speech Understanding 
Voice recognition is usually associated with the signal-matching, or template, techniques
used in isolated-word recognition. Speech understanding on the other hand, is usually 










using knowledge about the speech, rather than simply matching patterns as in voice 
recognition systems. 
• Category of Voice Recognition 
Table 2.1: Category of voice recognition parameters with its ranges
Parameters        Range
Speaking Mode     Isolated words to continuous speech
Speaking Style    Read speech to spontaneous speech
Enrollment        Speaker-dependent to Speaker-independent
Vocabulary        Small (less than 20 words) to Large (more than 20 words)
Language Model    Finite-state to Context-sensitive
2.3 Uses and Applications 
Although any task that involves interfacing with a computer can potentially use this 
system, the following applications are the most common right now. 
Dictation 
Dictation is the most common use for a voice recognition system today. This
includes medical transcription, legal and business dictation, as well as general
word processing. In some cases special vocabularies are used to increase the









Command and Control 
Voice recognition systems that are designed to perform functions and actions on 
the system are defined as Command and Control systems. Utterances like "Open
Netscape" and "Start" will do just that. 
Telephony 
Some systems allow callers to speak commands instead of pressing buttons to 
send specific tones. 
Wearable 
Because inputs are limited for wearable devices, speaking is a natural possibility. 
Medical/Disabilities
Many people have difficulty typing due to physical limitations such as repetitive
strain injuries (RSI), muscular dystrophy, and many others. For example, people
with difficulty hearing could use a system connected to their telephone to convert 
the caller's voice to text. 
Embedded Applications 
Some newer cellular phones include voice recognition that allows utterances such 
as "Call Home". 
2.4 Review of Existing Design
There are quite a number of voice recognition designs on the market that offer a variety of









1. OR 4 and ST Micro Electronics' Euterpe™ Digital Voice Processor are a few examples
of voice recognition chip designs available on the market.
The RSC-4x is a voice/speech recognition and analog I/O mixed-signal processor
developed by Sensory Inc. Based on an 8-bit micro controller, the RSC-4x integrates
speech-optimized digital and analog processing blocks into a single-chip solution capable
of accurate speech recognition, as well as high quality, low data-rate compressed
speech and advanced music.
2.4.1 Features 
• Full Range of Sensory Speech™ 7 Capabilities 
a) Enhanced word spotting capability (10 speaker independent or 5 speaker 
dependent words) in parallel 
b) Noise robust speaker independent, dependent & continuous listening 
recognition 
c) High quality, 3.7-7.8 kbps speech synthesis & sound effects 
d) Speaker verification ( SV) - voice biometric security 
e) 8 voice MIDI-compatible music synthesis coincident with speech; drum track 
feature enables additional voices 
f) Voice record & playback 











• Integrated Single-Chip Solution 
a) 8-bit microcontroller 
b) ROMless, 128KByte and 256KByte ROM options
c) 16 bit ADC, 10 bit DAC and microphone pre-amplifier
d) Independent, programmable Digital Filter engine
e) 4.8 KBytes total RAM (256 Bytes "user" application RAM)
f) Five timers (3 GP, 1 Watchdog, 1 Multi Tasking)
g) Twin-DMA, Vector Math accelerator, and Multiplier
h) Built-in Analog Comparator Unit (4 inputs)
i) External memory bus: 20-bit Address (1 Mbyte), 8-bit Data
j) On chip storage for SD, SV, templates (10 templates)
k) Code security through no ROM dump capability
l) Uses low cost 3.58MHz crystal (internal PLL)
m) Low EMI design for FCC and CE requirements
n) 24 configurable I/O lines with 10 mA (typical) outputs
o) Fully nested interrupt structure with up to 8 sources









2.4.2 Block Diagram of RSC-4x
Figure 2.1: The block diagram of the RSC-4x. 
The RSC-4x features an eight-bit micro controller with on-chip ADC, DAC, preamplifier,
RAM, ROM and optimized audio processing blocks. The CPU core embedded in the
RSC-4x is an 8-bit, variable-length-instruction micro controller. The instruction set is
similar to that of the 8051 micro controller, and has a variety of addressing modes, MOV and 16
bit instructions. The RSC-4x processor avoids the limitations of dedicated A, B, and










2.4.3 Algorithm used in RSC-4x 
The RSC-4x supports both Hidden Markov Modeling (HMM) and neural network
technologies, which are also described later in this report. Both approaches are able to
perform speaker independent (SI) speech recognition. Speaker independent recognition
requires on-chip or off-chip ROM to store the words to be recognized. Speaker dependent
(SD) recognition requires programmable memory to store personalized speech templates.
This programmable memory may be on-chip SRAM or off-chip Serial EEPROM, Flash

























The RSC-4x combines: 
a) 8-bit micro controller with instructions and interrupt control, register architecture,
independent Digital Filter engine and "L1" Vector Math Accelerator
b) On-chip ROM and RAM (4.8 Kbytes), and the ability to address off-chip RAM,
ROM, EPROM or Flash
c) Input microphone preamp and 16 bit Analog-to-Digital Converter (ADC) for
speech and audio/analog input
d) 10 bit Digital-to-Analog Converter (DAC), and 10 bit Pulse Width Modulator
(PWM) to directly drive a speaker or other analog device
e) Low power Audio Wakeup from power down mode, when a selected audio event,
such as clap or whistle, occurs
The RSC-4x has 20-bit address and 8-bit data buses for interfacing with external
memory. Members of the RSC-4x family with internal ROM contain an -XM input pin
capable of enabling or disabling the internal ROM. Three bi-directional ports
provide 24 configurable, general-purpose I/O pins to communicate with or control
external devices with a variety of source and sink currents. Up to 4 of these I/O pins may be
used as programmable Analog Comparator inputs, and 16 may be used for I/O wakeup.
The processor clock can be selected from either source, with a selectable divider value.
There are three programmable general-purpose 8-bit counters/timers. There is also a
Watchdog timer that may be used to exit an undesired condition in program flow, and









2.4.5 How RSC-4 Works in Recognizing Voice 
An external microphone passes an audio signal to the preamplifier and ADC, which convert
the incoming speech signal into digital data. Speech features are extracted using the
Digital Filter engine. The micro controller CPU processes these speech features using
speech recognition algorithms in firmware, with the help of the "L1" Vector Accelerator
and instruction set. The recognition results may be used to control the
consumer product application code, or to output speech or audio in the form of a dialog
with the user of the consumer product. If desired, the output speech or audio signal from
the RSC-4x is generated by a DAC for external amplification into a speaker, or by a PWM
capable of directly driving a speaker at typical consumer product volumes.
2.5 Review of Existing Technique Approach 
Many techniques have been implemented by voice recognition system
developers to extract the features and characteristics of speech and convert them into
understandable words. Each technique requires a lot of mathematical
calculation involving a specific algorithm.
2.5.1 Template Matching 
Template matching is one of the proven voice recognition techniques and has
resulted in many low-cost commercial systems. This system must be trained by
the system user. The associated software performs two functions during its
training mode: acoustic analysis and template generation. In the recognition










mode. The coded (linear predictive coding, LPC) data are then temporarily stored
in matrix form for comparison to the reference templates that were generated
during the training mode. A mathematical algorithm called dynamic programming
can be used to compare templates.
Dynamic programming and a related technique called time warping are used to
reduce recognition errors due to improper time alignment. [Andrew, 1987]
Dynamic programming is a pattern-matching algorithm that is used for both voice
recognition and visual pattern recognition. It is a matrix analysis technique that
computes all the possible combinations of time alignments between the reference
and unknown templates, the result being the best match between the two
templates.
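The time alignment described above can be sketched as a dynamic time warping (DTW) distance. This is a minimal illustration, assuming each template is a list of single numbers rather than the LPC matrices mentioned in the text; the function names `dtw_distance` and `recognize` and the absolute-difference cost are choices made for this example only.

```python
def dtw_distance(a, b):
    """Dynamic time warping distance between two non-empty feature sequences.

    cost[i][j] holds the cheapest alignment of a[:i] with b[:j]. Each frame
    here is a single number; real templates would hold LPC vectors and use
    a vector distance instead of abs().
    """
    inf = float("inf")
    n, m = len(a), len(b)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of: stretch a, stretch b, or advance both.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def recognize(unknown, templates):
    """Return the word whose reference template is closest to `unknown`."""
    return min(templates, key=lambda word: dtw_distance(unknown, templates[word]))
```

Because the matrix allows one frame to be matched against several frames of the other sequence, a word spoken more slowly than its reference template can still yield a small distance.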
2.5.2 HMM-Based 
In HMM (Hidden Markov Model) based voice recognition, estimation of the
parameters of HMMs is viewed as the counterpart of training or learning in traditional
sequential pattern recognition, since a speech signal can be represented by a
sequence of z-dimension vectors after features are extracted from the speech
signal. [Bahl, 1986] Voice samples with different durations contribute differently
to the estimation of parameters of the same HMM, and as a result HMMs model some
speech units well and others badly. Consequently, confusion among speech
units is unavoidable. The fact that on the whole, duration of phones varies










not a perfect solution, although it is an available approach to the above problem
because of the somewhat stochastic variation in the duration of phones. Furthermore,
when only a smaller training set is accessible, for instance in the case of speaker
adaptation, the problem becomes very serious.
2.5.3 Word Trigram Models 
Another widely used voice recognition technique is the algorithm using Word
Trigram models. The Word Trigram model is based on the Viterbi search, also known as
one-pass DP. Among the many speech recognition algorithms, the Viterbi search
is well suited to Hidden Markov Models (HMMs) used as language models.
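The Viterbi search mentioned above can be sketched for a small discrete HMM as follows. This is a generic textbook form of the algorithm, not the trigram decoder itself; the function name, the dictionary-based model layout and the requirement that the observation list is non-empty are assumptions made for this illustration.

```python
def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return (best_state_path, probability) for a discrete HMM.

    `observations` must be non-empty. The model is given as plain
    dictionaries: start_p[s], trans_p[prev][s] and emit_p[s][obs].
    """
    # best[t][s] = (probability of the best path ending in s at time t, predecessor)
    best = [{s: (start_p[s] * emit_p[s][observations[0]], None) for s in states}]
    for obs in observations[1:]:
        column = {}
        for s in states:
            column[s] = max(
                (best[-1][p][0] * trans_p[p][s] * emit_p[s][obs], p) for p in states
            )
        best.append(column)
    # Backtrack from the most probable final state.
    last = max(states, key=lambda s: best[-1][s][0])
    path = [last]
    for column in reversed(best[1:]):
        path.append(column[path[-1]][1])
    path.reverse()
    return path, best[-1][last][0]
```

Only the single best predecessor is kept for each state at each time step, which is why the search completes in one pass over the observations.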
2.5.4 Neural Network approach 
The neural network is trained on a small amount of stereo speech data, composed 
of simultaneous close-talking and distant-talking samples. [Hirschberg, 1993]
This small amount of adaptation data is negligible compared to the hours of data 
required to retrain the voice recognizer for a specific environment. Performance 
measurements on continuous voice recognition show that the system is capable of 
elevating the recognition accuracy of the voice recognizers to an acceptable 











2.5.5 Acoustic Phonetic Analysis 
The acoustic phonetic analysis is a logical strategy for accomplishing the goal
of voice recognition. This model is developed to handle large-vocabulary,
speaker-independent, isolated-word recognition. With phonetic analysis, the
speech sounds are divided into several broad phonetic categories. Template
matching techniques are not as reliable in speaker-independent use because of the
signal variation from speaker to speaker.
The idea behind acoustic phonetic analysis is to analyze the speech or voice signal
and separate the phoneme sounds to produce a detailed phoneme representation of
the utterance. From the phonetic representation, a memory lookup operation
would produce the corresponding word. Broad phonetic classifications such as
vowel, nasal, and fricative are less sensitive to fine phonetic variances from
speaker to speaker. [Andrew, 1987] The idea is to interpret the spoken word into
a series of broad phonetic categories. This information is then used to determine a
small set of possible word candidates, from which a more detailed phonetic
analysis produces the final word selection. Broad phonetic classification is a top-
down reasoning strategy employed by fine phonetic analysis.
A closer look at a typical feature-analysis system with all the sounds of speech/voice that









Pure voiced vowel (V)     a, e, i, o, u, uh, aa, ee, er, uu, ar, aw
Nasal (N)                 m, n, ng
Voiced fricative (VF)     z, zh, v, dh
Unvoiced fricative (UF)   s, sh, f, th
Plosive (P)               b, d, g, p, t, k, h
Glide (G)                 r, w, l, y
Figure 2.3: Phonetic Categories for a typical Feature Analysis system 
Now consider the words yes, no, begin, and stop. Using the six categories of Figure 2.3, 
the sounds of each word can be expressed as a sequence of sound categories as shown in 
Figure 2.4. But before that, the pronunciation of each word will be looked up from a 
dictionary. 
Begin    big in
Stop     stop
Figure 2.4: Sound Category Sequence for yes, no, begin and stop.
Assume that the vocabulary of our speaker-independent system consists solely of the four
words. A speaker utters one of the words into the system and an analysis algorithm
converts the utterance into a sequence of sound categories. A
decision-making algorithm must then be used to determine which of the four words was
actually spoken.
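The lookup described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the phoneme spellings in `PRONUNCIATIONS` are hypothetical dictionary entries chosen for this sketch, the category table follows Figure 2.3, and the decision step is a plain exact-match lookup rather than the decision-making algorithm of any particular recognizer.

```python
# Phonetic categories as listed in Figure 2.3.
CATEGORIES = {
    "V": ["a", "e", "i", "o", "u", "uh", "aa", "ee", "er", "uu", "ar", "aw"],
    "N": ["m", "n", "ng"],
    "VF": ["z", "zh", "v", "dh"],
    "UF": ["s", "sh", "f", "th"],
    "P": ["b", "d", "g", "p", "t", "k", "h"],
    "G": ["r", "w", "l", "y"],
}

# Reverse lookup: phoneme -> category label.
PHONEME_TO_CATEGORY = {p: c for c, ps in CATEGORIES.items() for p in ps}

# Hypothetical pronunciation dictionary for the four-word vocabulary.
PRONUNCIATIONS = {
    "yes": ["y", "e", "s"],
    "no": ["n", "o"],
    "begin": ["b", "i", "g", "i", "n"],
    "stop": ["s", "t", "o", "p"],
}


def category_sequence(word):
    """Express a vocabulary word as a tuple of broad sound categories."""
    return tuple(PHONEME_TO_CATEGORY[p] for p in PRONUNCIATIONS[word])


# Each word's category sequence is unique in this small vocabulary,
# so the sequence itself can serve as the lookup key.
SEQUENCE_TO_WORD = {category_sequence(w): w for w in PRONUNCIATIONS}


def decide(observed):
    """Return the word whose category sequence matches exactly, else None."""
    return SEQUENCE_TO_WORD.get(tuple(observed))
```

With these assumed pronunciations, "yes" becomes G V UF and "stop" becomes UF P V P. A real system would need a tolerant match, since the analysis stage can insert, drop or misclassify sounds.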
2.6 VHDL
VHDL is an acronym which stands for VHSIC Hardware Description Language. VHSIC
is yet another acronym which stands for Very High Speed Integrated Circuits. VHDL
describes hardware in much the same way as schematics do. VHDL is just another way of
describing what outputs of a digital circuit are desired when it is given certain inputs. The
critical difference between VHDL and these other languages is that it can be readily
interpreted by software, enabling the computer to accomplish our design. It is used to
describe digital hardware in an abstract (and therefore easily changeable) way.
It is used for documentation, verification, and synthesis of large digital designs.
The same VHDL code can theoretically achieve all three of these goals, thus saving a lot
of effort. VHDL is designed to fill a number of needs in the design process. Firstly, it
allows description of the structure of a design, that is, how it is decomposed into
sub-designs and how those sub-designs are interconnected. Secondly, it allows the
specification of the function of designs using familiar programming language forms.
Thirdly, as a result, it allows a design to be simulated before being manufactured, so that
designers can quickly compare alternatives and test for correctness without the delay and
expense of hardware prototyping. [Peter, 1990]
VHDL can be used to take three different approaches to describing hardware. These three
different approaches are the structural, data flow, and behavioral methods of hardware
description.
2.6.1 Primary Design Unit Model Structures
Each VHDL design unit comprises an "entity" declaration and one or more
"architectures". Each architecture defines a different implementation or model of a given
design unit. The entity definition defines the inputs to, and outputs from, the module, and
any "generic" parameters used by the different implementations of the module.
Entity Interface Specifications
entity name is
    generic (generic_list);        -- optional generic list
    port (port_definition_list);   -- input/output signal ports
end name;
Explanation:
1) An in port can be read but not updated within the module, carrying
information into the module.
2) An out port can be updated but not read within the module, carrying
information out of the module.
3) A buffer port likewise carries information out of a module, but can be both
updated and read within the module.
4) An inout port is bidirectional and can be both read and updated, with multiple
update sources possible.
5) Generics allow static information to be communicated to a block from its
environment for all architectures of a design unit. These include timing
information (setup, hold, delay times), part sizes, and other parameters.
• Architecture
Once an entity has had its interface specified in an entity declaration, one or more
implementations of the entity can be described in architecture bodies. Each architecture
body can describe a different view of the entity. An architecture defines one particular
implementation of a design unit, at some desired level of abstraction.
architecture arch_name of entity_name is
    ... declarations ...
begin
    ... concurrent statements ...
end arch_name;
Declarations include data types, constants, signals, files, components, attributes,
subprograms, and other information to be used in the implementation description.
[Douglas, 1998]
Concurrent statements describe a design unit at one or more levels of modeling
abstraction:
• Behavioral Model: No structure or technology implied. Usually written in
sequential, procedural style.
• Dataflow Model: All data paths shown, plus all control signals.
• Structural Model: Interconnection of components.
• Signals
Signals are used to connect submodules in a design. Signals are declared via signal
declaration statements or entity port definitions, and may be of any data type. The
declaration syntax is:
signal sig_name: data_type [:= initial_value];
Ports of an object are treated exactly as signals within that object.
2.6.2 VHDL PACKAGES
A VHDL package contains subprograms, constant definitions, and/or type definitions to
be used throughout one or more design units. Each package comprises a "declaration
section", in which the available (i.e. exportable) subprograms, constants, and types are
declared, and a "package body", in which the subprogram implementations are defined
along with any internally used constants and types. The declaration section represents the
portion of the package that is "visible" to the user of that package. The actual
implementation details in the package body are hidden from the user.
• Package component and subprogram declarations
package package_name is
    ... exported constant declarations
    ... exported type declarations
    ... exported subprogram declarations
end package_name;
package body package_name is
    ... type definitions
    ... subprograms
end package_name;
• Package declaration, which defines its interface.
• Package body, which defines the deferred details. The body part may be omitted
if there are no deferred details.
• VHDL Standard Packages
a) STANDARD - basic type declarations (always visible by default)
• IEEE Standard 1164 Package
This package, contained in the 'ieee' library, supports multi-valued logic signals with type
declarations and functions.
library ieee; -- VHDL library statement
2.6.3 Design Units and Libraries
A number of VHDL constructs may be separately analyzed for inclusion in a design
library. These constructs are called library units. The primary library units are entity
declarations, package declarations and configuration declarations. A design file may
contain a number of library units. The structure of a design file can be specified by the
syntax:
design_file ::= design_unit { design_unit }
design_unit ::= context_clause library_unit
context_clause ::= { context_item }
context_item ::= library_clause | use_clause
library_clause ::= library logical_name_list ;
logical_name_list ::= logical_name { , logical_name }
library_unit ::= primary_unit | secondary_unit
primary_unit ::=
    entity_declaration | configuration_declaration | package_declaration
Libraries are referred to using identifiers called logical names. This name must be
translated by the host operating system into an implementation-dependent storage name.
Library units in a given library can be referred to by prefixing their name with the library
logical name. So for example, ttl_lib.ttl_10 would refer to the unit ttl_10 in library ttl_lib.
2.6.4 Advantages of using VHDL
There are many reasons why it makes good design sense to use VHDL:
• Portability: Technology changes so quickly in the digital industry that discrete
digital devices require constant rework in order to remain current. VHDL is
designed to be device-independent, meaning that if we describe our circuit in
VHDL, as opposed to designing it with discrete devices, changing hardware
becomes a (relatively) trivial process.
• Flexibility: Changes of design specification cannot be avoided. Design work is
usually focused on creating small, easily maintainable components and then
integrating these components into a larger device. On larger projects, different
teams of engineers will each design separate parts of the project at the same time.
This can mean that if one component in the project changes, all of the components
must change, even those being worked on by other engineering teams. But if our
design uses VHDL, all we have to do is change our code, and we do not have
to start over from scratch.
• Standard: Many hardware description languages are developed to serve a
particular design methodology. VHDL is independent of technology, which reduces
confusion and makes the interfaces between tools, companies and products easier.
• Cost: VHDL supports a reliable design process with minimum cost and time.
• Productivity: VHDL can increase productivity by shortening the time to market.
Behavioral simulation can reduce design time by allowing design problems to be
detected early on, avoiding the need to rework designs at gate level.
• Better design: Behavioral simulation permits design optimization by exploring
alternative architectures, resulting in better designs.
• Reusability: A system may be used again in other instances for which it may or
may not have been specifically intended. To move a design to a new technology,
there is no need to start from scratch or to reverse-engineer a specification. Instead,
the design tree is followed back up to a behavioral VHDL description, which can be
implemented in the new technology knowing that the correct functionality is preserved.
2.7 FPGA
A field-programmable gate array (FPGA) is an integrated circuit (IC) that can be
programmed in the field after manufacture. FPGAs are used by engineers in the design of
specialized ICs that can later be produced hard-wired in large quantities for distribution
to computer manufacturers and end users.
FPGAs contain hundreds (or thousands) of Configurable Logic Blocks (CLBs). One CLB is
a rectangular area on the chip that contains a lookup table (LUT), a flip-flop and routing
resources. The flip-flop allows synchronization (based on a clock signal), and the routing
is just a lot of interconnection wiring between the CLBs, which can be linked together to
form complex logic implementations.
An array of logic blocks is surrounded by programmable I/O blocks and connected with
programmable interconnections. Most FPGAs do not provide 100% interconnection
between their logic blocks because it would be prohibitively expensive. Instead,
sophisticated software places and routes the logic on the device, much as a
Printed Circuit Board (PCB) autorouter would place and route components.
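To make the lookup table and flip-flop pairing concrete, the following Python sketch models one logic cell behaviorally. It is an illustration only: the class name `LogicCell`, the 4-input table size and the bit ordering of the inputs are assumptions made for this sketch, not the structure of any particular FPGA family.

```python
class LogicCell:
    """Behavioral model of one lookup table feeding one D flip-flop."""

    def __init__(self, truth_table):
        # A 4-input LUT is assumed here, so the table holds 2**4 = 16 bits.
        if len(truth_table) != 16:
            raise ValueError("a 4-input LUT needs exactly 16 entries")
        self.truth_table = list(truth_table)
        self.q = 0  # registered output of the flip-flop

    def lut(self, a, b, c, d):
        """Combinational output: the inputs form an index into the table."""
        index = (d << 3) | (c << 2) | (b << 1) | a
        return self.truth_table[index]

    def clock(self, a, b, c, d):
        """On a clock edge the flip-flop captures the current LUT output."""
        self.q = self.lut(a, b, c, d)
        return self.q


# "Programming" the cell means choosing the table contents.
and4 = LogicCell([0] * 15 + [1])  # output is 1 only when all inputs are 1
xor2 = LogicCell([(i & 1) ^ ((i >> 1) & 1) for i in range(16)])  # a XOR b
```

The same cell implements any function of four inputs simply by loading a different table, which is what makes the logic block configurable.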
2.8 Analog-to-Digital Converter
Analog-to-digital conversion is an electronic process in which a continuously variable
(analog) signal is changed, without altering its essential content, into a multi-level
(digital) signal.
An ADC is used to convert an analog input voltage into a digital output. The input to an
analog-to-digital converter (ADC) consists of a voltage that varies among a theoretically
infinite number of values. Examples are sine waves, the waveforms representing human
speech, and the signals from a conventional television camera. The output of the ADC, in
contrast, has defined levels or states. The number of states is almost always a power of
two, that is, 2, 4, 8, 16 and so on.
2.8.1 How does the ADC work?
ADCs require clocking and contain control logic including comparators, multiplexers and
registers. This means that in order to get the ADC to work, a total of seven control
signals must be sent from the FPGA. These are the address lines A, B, and C,
Address Latch Enable (ALE), Clock, Start, and Output Enable (OE). There is also one
control signal that is sent to the FPGA: the End of Conversion (EOC) signal.
• Address Lines
Because the chip has an 8-channel multiplexer, there are three address select lines: A,
B, and C.
Selected Analog Channel    Address Line
                           C    B    A
IN0                        L    L    L
IN1                        L    L    H
IN2                        L    H    L
IN3                        L    H    H
IN4                        H    L    L
IN5                        H    L    H
IN6                        H    H    L
IN7                        H    H    H
Table 2.2: Channel Selection
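The channel selection in Table 2.2 is a plain 3-bit binary decode. The Python sketch below illustrates this; the function names are hypothetical, and treating C as the most significant address bit is an assumption taken from the column order of the table.

```python
def select_channel(c, b, a):
    """Map address lines C, B, A (each 'H' or 'L') to an analog input name."""
    bits = {"L": 0, "H": 1}
    # C is assumed to be the most significant bit, A the least significant.
    index = (bits[c] << 2) | (bits[b] << 1) | bits[a]
    return f"IN{index}"


def address_for(channel):
    """Inverse mapping: channel number 0..7 to the (C, B, A) line levels."""
    if not 0 <= channel <= 7:
        raise ValueError("the multiplexer has only 8 channels")
    return tuple("H" if (channel >> shift) & 1 else "L" for shift in (2, 1, 0))
```

For example, `select_channel("L", "H", "H")` gives `"IN3"`, and `address_for(5)` gives `("H", "L", "H")`, matching the corresponding rows of the table.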
• ALE
ALE is required to load the selected address lines into the ADC. Once loaded, the
multiplexer sends the appropriate channel to the converter on the chip. As with all control
signals, it is required to have an input value of Vcc - 1.5 V up to 15 V for a high and 1.5 V
down to -0.3 V for a low. The following control signals are used to control the conversion.
• Clock
The clock signal is required to cycle through the comparator stages to do the conversion.
Eight 8-clock-cycle periods are required in order to complete an entire conversion. (A
conversion can take longer if the start
signal is received in the middle of an 8 clock cycle period.) The clock should conform to
the same range as all other control signals. The maximum clock frequency is affected by
the source impedance of the analog inputs.
• Start
The purpose of the start signal is twofold. On the rising edge of the pulse the internal
registers are cleared, and on the falling edge of the pulse the conversion is initiated. At
higher clock speeds the user must make certain that enough time has passed
since the ALE signal was pulsed so that the correct address is loaded into the multiplexer
before a conversion begins.
• OE
The Output Enable signal causes the ADC to actually output the digital values on the
output lines. The ADC stores the data in a tri-state output latch until the next conversion
is started, but the data is only output when enabled.
• EOC
The End of Conversion signal is sent to the FPGA from the ADC. The signal goes low
once a conversion has started and returns high when the conversion is complete.
Voice Recognition Project Methodology 
Chapter 3 : Project Methodology 
3.1 Introduction 
The VHDL hierarchical design approach comprises both top-down and bottom-up design. A
project can be implemented with either methodology, depending on the system requirements.
Each approach has its own advantages that are useful to the system designer. In gathering
information for developing the project, several techniques are practiced, such as
discussions, surfing the World Wide Web and brainstorming. The main development
tools used in developing the voice recognition system are the XStend Board version
1.3.2 by XESS Corporation and Xilinx's WebPACK ISE, which is used to compile the
VHDL source code and generate the necessary output.
3.2 Design Method
A bottom-up design methodology starts by defining the "low level" procedures and then moves
up towards more and more complex procedures using those already defined. Meanwhile, a
top-down design methodology is a software design technique which aims to describe
functionality at a very high level, then partition it repeatedly into more detailed levels, one
level at a time, until the detail is sufficient to allow coding.
A top-down design methodology has been chosen because it is a natural way to approach a
complex design task. It relies on multiple levels of abstraction to limit the number of
independent concepts at each level of the design. The design is broken down into as many
levels as necessary.
Figure 3.1 : Simplified top-down design methodology 
From Figure 3.1 we can see that, in a top-down approach, the system is divided into blocks.
Starting with the highest level of abstraction, each block is then progressively modeled, 
interconnected, and simulated, until the design is broken down to its most fundamental level. 
The benefits of top-down design methodology: 
• Control over design intent 
• Concurrent design collaboration 
• Increased performance 
• Intelligent re-use of data 
3.3.1 Discussion
Discussion is the primary method used during the beginning of the project.
Discussions cover the exact path and objectives of the project. Frequent visits to my
supervisor have helped me a lot in developing the idea of enhancing the project.
3.3.2 Brainstorming
A brainstorming session was held to generate better ideas after the project's
objectives had been identified. Brainstorming is also very important in ensuring that the
project covers most of the vital aspects of developing a voice recognition system.
3.3.3 Library
The library is the main source of information. Books, journals, theses and magazines
are some of the useful materials that have been used as references.
3.3.4 Internet
Undeniably, the Internet has become part of our lives. It has become one of the main and
fastest sources of information. Accessible from anywhere, information can be
obtained easily from the Internet. However, the information has been validated to ensure
its reliability.
3.4 Development Tools
3.4.1 XStend Board V1.3.2 and XS 100 Board
XStend Board V1.3.2 (Figure 3.2) and XS 100 Board (Figure 3.3) are chosen as the
development tools of this project. The XStend Board contains resources such as
pushbuttons, DIP switches, LEDs, and a prototyping area that are useful for basic lab
experiments.
However, the small physical size of XS Boards such as the XS 100 limits the amount of
support circuitry they can hold. The XStend Board removes this limitation by providing
additional support circuitry that the XS 100 can access through its breadboard interface.
Figure 3.4: XStend Board view
These resources are shown in the simplified view of the XStend Board and XS 100 Board.
Table 3.1: The functions of resources on XStend Board and XS 100 Board.
Resources        Functions
LEDs             The XStend Board provides a bar graph LED with eight LEDs
                 (D1-D8) and two more LED displays (U1 and U2) for use by an
                 XS 100 Board. All of these LEDs are active-low, meaning that an
                 LED segment will glow when a logic-low is applied to it.
Switches         The XStend has a bank of eight DIP switches and two
                 pushbuttons (labeled SPARE and RESET). When closed or ON,
                 each DIP switch pulls the connected pin of the XS 100 Board to
                 ground.
                 When the DIP switch is open or OFF, the pin is pulled high
                 through a 10 kΩ resistor.
                 When not being used, the DIP switches should be left in the open
                 or OFF configuration so the pins of the XS 100 Board are not tied
                 to ground and can freely move between logic low and high levels.
                 When pressed, each pushbutton pulls the connected pin of the XS
                 100 Board to ground. Otherwise, the pin is pulled high through a
                 10 kΩ resistor.
PS/2 Keyboard    It provides a clock signal and a serial data stream that is
                 synchronized with the falling edges on the clock signal.
RAMs             The chip-selects of the XStend Board RAMs are connected to
                 different pins so all the RAMs can be individually selected.
Stereo Codec     The stereo codec digitizes two analog input channels into digital
                 values, and sends the digital values to the XS 100 Board as a
                 serial bit stream. The codec also accepts a serial bit stream from
                 the XS 100 Board and converts it into two analog output signals,
                 which exit the XStend Board through J10.
3.4.2 XILINX WebPACK ISE
The software chosen to support the programmable logic design used in
voice recognition is Xilinx's WebPACK ISE (Figure 3.5). Xilinx is a leading
supplier of complete programmable logic solutions. WebPACK ISE is a software solution
that supports advanced HDL entry, synthesis, simulation, and verification
for both CPLD and FPGA designs (Figure 3.6 and Figure 3.7).
Voice Recognition Proposed Design 
Chapter 4: Proposed Design 
4.1 Introduction 
Below is a general view of the process of developing the voice recognition
system as a whole using the Xilinx XC2S100 Spartan-II FPGA.
[Figure: block diagram of the voice recognition system. Blocks: Analog-to-Digital
Converter Codec Interface; Serial/Analog Input Reading and Overrun Detection;
Voice Recognition Algorithm (End Point Detection, Pattern Extraction)]
4.2 Inputting Voice Signals through the XStend Board V1.3.2 Codec and
outputting them to the LEDs
The preprocessing circuit will take a stereo input and output it through a voice
recognition algorithm to the LED bars. The preprocessing block is used to adapt the
characteristics of the input signal to the recognition system. The stereo Codec on the
XStend Board (a product from Cirrus Logic, CS4222) is capable of digitizing two
analog signals to 20 bits of resolution while simultaneously generating two analog signals
from 20-bit values.
The complete process of preprocessing the speech signal in developing a voice recognition
system is as follows:
1) In the first stage, the two analog inputs (which are typically the left and
right channels of a stereo audio signal) enter the Codec and are digitized into
two 20-bit values by analog-to-digital converters (ADCs) using the delta-sigma
conversion technique. Discretization in time and amplitude of the signal will
happen in this stage.
2) Then the values are loaded into shift registers in the Codec and shifted
out of a single pin of the Codec under control of a shift clock and a left/right
channel selector control input. Figure 4.2 gives a rough idea of the
connection between the XStend Codec chip and the XS Board FPGA.
Figure 4.2: Connections between XStend Codec chip and FPGA
3) From Figure 4.2, we can see that the FPGA contains a set of shift registers. These
serial-in parallel-out shift registers will convert the serial input stream into
20-bit values.
4) A read operation will be performed once the output is generated. Overflow
of the FPGA shift registers will also be detected here if they are not read in
time.
5) The speech or voice signal will be analyzed by the chosen voice recognition
algorithm to separate the phoneme sounds and produce a detailed phoneme
representation of the utterance.
6) After the specific algorithm has been performed on the voice signal, the
result will be sent to a 20-bit register (which is actually a latch) that will store
it before it is sent to the LED decoder.
7) Finally, the numbers uttered by the speaker will be displayed on the LED
bars. Figure 4.3 shows the block diagram of the entire process.
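The serial-in parallel-out conversion in step 3 can be modeled behaviorally as follows. This is a software sketch, not the VHDL of the design: the class and helper names are hypothetical, and shifting the most significant bit first is an assumption about the serial format.

```python
class SerialInParallelOut:
    """Behavioral model of a serial-in parallel-out shift register."""

    def __init__(self, width=20):
        self.width = width
        self.value = 0  # bits collected so far
        self.count = 0  # number of bits shifted in for the current word

    def shift_in(self, bit):
        """Shift in one bit (MSB first is assumed).

        Returns the completed word once `width` bits have arrived,
        otherwise None.
        """
        mask = (1 << self.width) - 1
        self.value = ((self.value << 1) | (bit & 1)) & mask
        self.count += 1
        if self.count == self.width:
            word = self.value
            self.value = 0
            self.count = 0
            return word
        return None


def to_serial(word, width=20):
    """Helper for this sketch: split a word into bits, MSB first."""
    return [(word >> i) & 1 for i in range(width - 1, -1, -1)]
```

Feeding the 20 bits of a sample through `shift_in` returns `None` for the first 19 bits and the reassembled 20-bit value on the last one, which is the point at which a read operation (step 4) can take place.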
4.3 Voice Signal Preprocessing
Voice signal preprocessing, also known as the front-end of the speech recognition system,
is the most important part in developing this system. The features extracted here are
transmitted over an interface channel to a completed "back-end" recognizer. The end result
is that the voice transmission does not affect the recognition system performance.
4.3.1 Speech Digitization 
When an analogue signal is converted to digital form, it is made discrete both in time and 
amplitude. Discretization in time is the operation of sampling, while in amplitude it is 
quantization. 
• Sampling
A fundamental theorem of telecommunications states that a signal can only be
reconstructed accurately from a sampled version if it does not contain components whose
frequency is greater than half the frequency at which the sampling takes place. Say the
sampling interval is T seconds, so that the sampling frequency is 1/T Hz. [Ian, 1982] The
sampling theorem states that the sampling frequency of a signal must be at least twice the
signal frequency in order to recover the sampled signal without distortion. When a signal
is sampled, its input spectrum is copied and mirrored at multiples of the sampling
frequency fS. Figure 4.4 shows the spectrum of a sampled signal when the sampling
frequency fS is less than twice the input signal frequency. The shaded area on the plot
shows what is commonly referred to as aliasing, which results when the sampling
theorem is violated.
Recovering a signal contaminated with aliasing results in a distorted output signal.
Figure 4.5 shows the spectrum of an oversampled signal. The oversampling process puts
the entire input bandwidth at less than fS/2 and avoids the aliasing trap. The analog signal
is continuous in time and it is necessary to convert this to a flow of digital values. It is
therefore required to define the rate at which new digital values are sampled from the
analog signal. The rate of new values is called the sampling rate.
Figure 4.5: Oversampled signal spectrum
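The effect of violating the sampling theorem can be checked numerically. The short Python sketch below is an illustration only; the function names are hypothetical, and it considers a single sinusoidal component rather than a full speech spectrum.

```python
def is_aliased(f_signal, f_sample):
    """True when the sampling frequency is below twice the signal frequency."""
    return f_sample < 2 * f_signal


def alias_frequency(f_signal, f_sample):
    """Apparent frequency of a sampled sinusoid.

    The spectrum repeats at multiples of f_sample and is mirrored,
    so the component folds back into the range 0 .. f_sample / 2.
    """
    f = f_signal % f_sample
    return min(f, f_sample - f)
```

With an 8000 Hz sampling frequency, a 3000 Hz tone is preserved, while a 5000 Hz tone folds back and also appears at 3000 Hz. After sampling, the two are indistinguishable, which is why an aliased signal cannot be recovered without distortion.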
• Quantization
Quantization is discretization in amplitude. This is performed by an A/D converter
which takes as input a constant analog voltage and generates a corresponding binary
value as output. The simplest correspondence is uniform quantization, where the
amplitude range is split into equal regions by points termed "quantization levels", and
the output is a binary representation of the nearest quantization level to the input
voltage. Typically 11-bit conversion is used for speech, giving 2048 quantization
levels, and the signal is adjusted to have zero mean so that half the levels correspond
to negative input voltages and the other half to positive ones.
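Uniform quantization as described above can be sketched as follows. The code is illustrative: the `full_scale` parameter (the largest input magnitude) and the signed integer coding of the levels are assumptions made for this sketch, not details taken from a specific converter.

```python
import math


def quantize(voltage, full_scale=1.0, bits=11):
    """Return the signed code of the nearest quantization level.

    With 11 bits there are 2**11 = 2048 levels: codes -1024 .. -1 cover
    negative inputs and codes 0 .. 1023 cover non-negative inputs.
    """
    levels = 1 << bits
    step = 2 * full_scale / levels  # width of one quantization region
    code = math.floor(voltage / step + 0.5)  # round to the nearest level
    # Inputs outside the range are clipped to the extreme levels.
    return max(-levels // 2, min(levels // 2 - 1, code))


def reconstruct(code, full_scale=1.0, bits=11):
    """Voltage represented by a code; differs from the input by at most step/2
    for inputs inside the unclipped range."""
    step = 2 * full_scale / (1 << bits)
    return code * step
```

For a full scale of 1.0 the step is 1/1024 of a volt, so an input of 0.5 maps to code 512, and anything at or above the top of the range clips to code 1023.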
4.4 FPGA (Field Programmable Gate Array) Circuitry
The FPGA handles these values in a bit-parallel manner, so the FPGA must contain a
set of shift registers which convert the serial input stream into 20-bit values. The
shifting of data through the serial
input and output pins is synchronized with the same left/right channel select signal used
by the Codec chip. In addition to the shift registers, the FPGA needs circuitry to read the 
signals and to indicate when they are full or empty. Since the Codec ADCs generate and 
consume data at a set sample rate, it is also necessary to build circuitry which detects 
overflow and underflow of the FPGA shift registers if they are not read in time. 
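The full/empty indication and overflow detection described above can be modeled in a few lines. This Python sketch is a behavioral illustration with hypothetical names; it covers only the ADC (input) direction, so underflow of an output register is not modeled.

```python
class AdcChannelBuffer:
    """Holds one received sample and tracks full/empty and overflow status."""

    def __init__(self):
        self.word = None
        self.full = False      # True while an unread sample is waiting
        self.overflow = False  # sticky error flag

    def load(self, word):
        """Called when a complete sample has been shifted in."""
        if self.full:
            # The previous sample was never read before the next one arrived.
            self.overflow = True
        self.word = word
        self.full = True

    def read(self):
        """Read the waiting sample and mark the buffer empty."""
        if not self.full:
            raise RuntimeError("no sample available")
        self.full = False
        return self.word
```

Because the Codec delivers samples at a fixed rate, the reader must call `read` once per sample period; if it falls behind, the second `load` sets `overflow` and the unread sample is lost.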
The FPGA circuitry can be decomposed into three modules:
• A clock generator module, which outputs the serial data shift clock and the left/right
channel select signals;
• A channel module, which contains the shift registers, buffers, read/write control, and
overflow/underflow detection circuitry for a single input/output stream of data;
• A top-level module, which combines the clock generator module with two channel
modules.
Figure 4.7: The top level design of the Voice Recognition system 
Once the Codec interface module is completed and packaged, we will use it in the voice
recognition application. We will use the FPGA to accept the left and right stereo
inputs from the Codec ADCs, apply a specific voice recognition algorithm to
the voice signals, and then send the result to a 20-bit register before outputting it as
the LED signals.
4.4.2 Operations in each module
• Clock Generator 
A clock generator module is used to output the serial data shift clock and the left/right 
channel select signals. 
i) The tYPicaJly main clock input which is going to be used here is the 12 Mhz 









ii) There will be an output that controls the activation of the left and right 
channel circuitry in the Codec and the FPGA. 
iii) The Codec chip requires that the channel duration be either 128, 192, or 256
master clock periods in length. Thus, the total time to handle both channels is
256, 384, or 512 clock periods. This sets the sampling rate.
iv) Using a channel duration of 128 with a 12 MHz clock gives a sampling rate
of 46.875 kHz (12 MHz / (128 × 2) = 46.875 kHz), which is sufficient for a voice
signal.
v) The channel selector will then be output to the Codec.
vi) The serial data shift clock is one-quarter of the master clock, so transmitting
or receiving a 20-bit value will require 4 × 20 = 80 clock periods, which will
fit within the shortest possible channel duration.
vii) Finally a process of incrementing the sequencing counter and toggling the 
left/right channel selector will be performed when the count reaches the 
duration for which a channel is active. 
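The timing arithmetic in items iii), iv) and vi) can be checked with a short calculation. The following is only an illustrative sketch in Python, not part of the FPGA design; the function and constant names are assumptions introduced here.

```python
# Illustrative timing check for the clock generator (names are hypothetical).
MASTER_CLOCK_HZ = 12_000_000                 # 12 MHz master clock, as stated in the text
ALLOWED_CHANNEL_DURATIONS = (128, 192, 256)  # master clock periods per channel
ADC_WIDTH = 20                               # bits per sample
CLOCKS_PER_SERIAL_BIT = 4                    # shift clock is one quarter of the master clock


def sampling_rate_hz(channel_duration: int) -> float:
    """Sampling rate when each of the two channels lasts channel_duration clocks."""
    if channel_duration not in ALLOWED_CHANNEL_DURATIONS:
        raise ValueError("channel duration must be 128, 192 or 256")
    return MASTER_CLOCK_HZ / (2 * channel_duration)


def serial_transfer_clocks(width: int = ADC_WIDTH) -> int:
    """Master clock periods needed to shift one sample in or out."""
    return CLOCKS_PER_SERIAL_BIT * width


for duration in ALLOWED_CHANNEL_DURATIONS:
    fits = serial_transfer_clocks() <= duration
    print(duration, sampling_rate_hz(duration), fits)
```

With a channel duration of 128 the function returns 46875.0 Hz, and the 80 clock periods needed for a 20-bit transfer fit within every allowed channel duration.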
• Channel module 
It contains shift registers, buffers, read control, and overflow/underflow detection
circuitry for a single input/output stream of data.
i) There will be a process of receiving the serial data stream from the Codec ADC that












ii) A shift register and a flag will both be used to indicate the current status of
the shift register.
iii) The status of the shift register will change whenever it is accepting serial data.
iv) The shift register status changes to full as soon as the last bit enters the shift
register.
v) A flag is maintained that indicates whether the contents of the ADC shift
register have been read. The flag is set when the ADC register for the channel
is full and it is selected for a read operation. The flag will stay set after the
read operation is complete.
vi) There will also be a process of monitoring and detecting an error condition of
the ADC shift register and flags. This happens when the register begins
accepting bits from the current sample period but the data from the previous
period has not yet been read, which causes the overwriting of data from











• Read operation 
[Flowchart: serial data from Codec ADC -> ADC shift register -> update the status of ...]



























• Codec Interface 
It is a top-level module, which combines the clock generator module with two channel 
modules to form a complete circuit. 
i) Once the clock generator module is instantiated, it will receive the 12 MHz
clock as an input and generate the left/right clock and the serial shift clock for
the Codec.
ii) The module which handles the left and right channels is instantiated, and this
module will be activated for the read operation by the left or right selection
input.
iii) The overrun error is reported here when either channel reports an error.
4.4.3 Overrun Error 
An overrun error occurs when new data arrives while the receive buffer is full (there is no
place to put the new data). The data that caused the overrun condition to be detected is
lost, and the last good data character that was received is flagged with the overrun error. The
flag indicates that new incoming data is overwriting previously received data that has not
been read out yet. The potential for data overruns is always present. Data
overruns must be guarded against, since the overrun error condition can only be detected
after one or more data characters have already been lost. The parity error, framing error,
and overrun error indicate any problems with the current received data.
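The overrun condition described above can be illustrated with a small behavioural model. This is a minimal sketch in Python under stated assumptions, not the VHDL of the channel module: the class name, the attribute names and the MSB-first shift order are hypothetical.

```python
class AdcChannel:
    """Behavioural model of one ADC channel register (all names are hypothetical)."""

    def __init__(self, width: int = 20):
        self.width = width
        self.shift_reg = 0
        self.bits_in = 0
        self.full = False      # all bits of the current sample have arrived
        self.was_read = True   # nothing unread is pending at start-up
        self.overrun = False   # sticky error flag

    def shift_in(self, bit: int) -> None:
        """Accept one serial bit (assumed MSB first)."""
        if self.bits_in == 0:
            # A new sample period starts: unread data from the previous
            # period is about to be overwritten, which is the overrun case.
            if self.full and not self.was_read:
                self.overrun = True
            self.full = False
            self.shift_reg = 0
        self.shift_reg = ((self.shift_reg << 1) | (bit & 1)) & ((1 << self.width) - 1)
        self.bits_in += 1
        if self.bits_in == self.width:
            self.full = True
            self.was_read = False
            self.bits_in = 0

    def read(self):
        """Return the stored sample, or None when no complete sample is ready."""
        if not self.full:
            return None
        self.was_read = True
        return self.shift_reg
```

In this model the error flag only rises when the first bit of a new sample arrives, which matches the observation that an overrun can only be detected after data has already been lost.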
Figure 4.9 and Figure 4.10 differentiate a normal data flow in a register from a flow











a) The normal data flow in a register 






















Figure 4.10: Register with overrun error
4.5 Shift Register 
There are only two basic ways of sending and receiving digital data. These methods are
known as parallel transmission and serial transmission. In parallel transmission, all the
bits that make up a byte of data are sent at one time. The bits are all lined up in parallel.
The other method for sending data from one device to another is serial transmission.










The FPGA handles these values in a bit-parallel manner, so the FPGA must contain a
set of shift registers which convert the serial input stream into 20-bit values.
A shift register is a register that is capable of shifting its stored bits laterally in one or both
directions. The 8-bit shift register has gated serial inputs and a CLEAR input. Each register bit is
a D-type master/slave flip-flop (Figure 4.11). Inputs A and B permit complete control
over the incoming data. A LOW at either or both inputs inhibits entry of new data and
resets the first flip-flop to the low level at the next clock pulse. A HIGH level on one
input enables the other input, which will then determine the state of the first flip-flop.
The voice signal at the serial inputs may be changed while the clock is HIGH or LOW, but
only information meeting the setup and hold time requirements will be entered. Data is
serially shifted in and out of the 8-bit register during the positive-going transition of the
clock pulse. Clear is independent of the clock and is accomplished by a low level at the
CLEAR input.
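The behaviour of the gated serial inputs can be sketched with a few lines of Python. This is a simplified model for illustration only: it ignores setup and hold times, and the class and method names are assumptions rather than part of the design.

```python
class GatedShiftRegister:
    """Simplified model of a serial-in shift register with gated inputs A and B."""

    def __init__(self, width: int = 8):
        self.width = width
        self.q = [0] * width  # q[0] is the first flip-flop

    def clear(self) -> None:
        """Asynchronous clear: a low level on CLEAR resets every stage."""
        self.q = [0] * self.width

    def clock(self, a: int, b: int) -> list:
        """Rising clock edge: shift by one stage and load (A AND B) into stage 0."""
        self.q = [a & b] + self.q[:-1]
        return list(self.q)


reg = GatedShiftRegister()
reg.clock(1, 1)  # both inputs HIGH: a 1 enters the first flip-flop
reg.clock(1, 0)  # one input LOW: a 0 enters and the earlier 1 moves along
print(reg.q)
```

A LOW on either input forces a 0 into the first stage, which is how one input can be used to enable or inhibit the other.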









4.6 Voice recognition algorithm 
The voice recognition system consists of five major modules: the speech processing
module, the end point module, the pattern module, the pattern matching module and the
decision module.
Initially, the voice data is acquired using an external microphone. The voice input is then
sampled at a rate of approximately 46 kHz with 20-bit resolution, and the voice is
digitized at 22050 samples per second. Each of these samples is converted to a 16-bit
digital representation.
Once the signal is sampled, the digitized voice is fed into a field-programmable gate
array (FPGA), the Xilinx XC2S100 Spartan-II FPGA. The End Point Detector (EDP)
block is used to detect the beginning and end of the word pronounced by the user. A major
cause of errors in voice recognition is the inaccurate detection of the beginning and ending of
a spoken word. A zero-crossing algorithm will be used in this module.
Pattern extraction is the process that extracts a small amount of data from the voice signal
that can later be used to represent each word. Initially, the sampled data will be broken into 8
small periods of time; I call each period a "chunk". After that, a simple zero-crossing
algorithm was implemented where a count of crossings on the entire sample was













Pattern matching involves the actual procedure to identify the unknown word by
comparing the features extracted from the user's voice input. In this module, the difference
in the number of crossings between each chunk of the temporary template and the related
chunk of the permanent templates is found, and then all the differences are summed (Sum of
Absolute Differences). At the same time, the difference between each related chunk is found and
squared, and then the squared differences are summed (Sum of Squared Differences). The
final step is to get the difference between the Sum of Squared Differences and the Sum of
Absolute Differences.
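The pattern extraction and pattern matching steps can be sketched in software as follows. This is a hedged Python illustration, not the FPGA implementation: the function names are hypothetical, a sample equal to zero is treated as non-negative, and the final score is assumed to be the Sum of Squared Differences minus the Sum of Absolute Differences, since the text does not state the order of the subtraction.

```python
def zero_crossings(samples):
    """Count sign changes between consecutive samples (zero counts as non-negative)."""
    return sum(1 for a, b in zip(samples, samples[1:]) if (a < 0) != (b < 0))


def chunk_features(samples, n_chunks=8):
    """Split the samples into n_chunks equal chunks and count the crossings in each.

    Samples left over after the last full chunk are ignored in this sketch.
    """
    size = len(samples) // n_chunks
    return [zero_crossings(samples[i * size:(i + 1) * size]) for i in range(n_chunks)]


def match_score(candidate, template):
    """Compare two per-chunk crossing counts; assumed score is SSD minus SAD."""
    diffs = [c - t for c, t in zip(candidate, template)]
    sad = sum(abs(d) for d in diffs)   # Sum of Absolute Differences
    ssd = sum(d * d for d in diffs)    # Sum of Squared Differences
    return ssd - sad


features = chunk_features([1, -1] * 8)
print(features, match_score(features, features))
```

Identical feature vectors give a score of zero, and the score grows quickly once any chunk differs by more than one crossing.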
The decision algorithm compares the results from the pattern matching. Initially, the
algorithm finds the minimum value among the results. The minimum value is then
compared against a threshold value. If the minimum value is less than the threshold, then
that word is chosen as the recognized word; otherwise the incoming word is deemed
invalid and ignored.
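The decision rule can be written down compactly. The sketch below is a Python illustration with hypothetical names and example scores; the threshold value itself is not given in the text and is passed in as a parameter.

```python
def decide(scores, threshold):
    """Pick the template word with the smallest score, if it is below the threshold.

    scores maps each template word to its pattern-matching result.
    Returns the recognized word, or None when the input is deemed invalid.
    """
    if not scores:
        return None
    word = min(scores, key=scores.get)
    return word if scores[word] < threshold else None


print(decide({"one": 12, "two": 3}, threshold=10))
print(decide({"one": 12, "two": 30}, threshold=10))
```

The first call selects "two" because its score is the minimum and lies below the threshold; the second call rejects the input because even the best match is not below the threshold.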
4.7 20-bit Register
All the bits of the register are loaded simultaneously with a common clock pulse. The 
symbol provided below permits the use of the register in a design hierarchy. It has all of 
the inputs to the FPGA circuit on its left and all of the outputs on the right. The inputs 
include the clock input with the dynamic indicator to represent positive-edge triggering of 
the flip-flops. Note that the name clear appears inside the symbol, with a bubble in the 





















Figure 4.12 : A symbol of a 20 bits register 
4.8 LED Decoder 
[Block diagram: the LED Decoder receives inputs D1, D2 and D3 and drives outputs S0 to S6, which are connected to segments S0 to S6 of the 7-segment LED.]









An LED decoder takes a three-bit input from the Decision Module and outputs seven
signals which drive the segments of an LED digit. The LED segments will be driven to
display the digit corresponding to the value of the three input bits as follows:
Table 4.1 : Number display by the LED segments
3 Bit Input    7 Bit Output    Number     LED Display
000            1110111         0          0
001            0010010         1          1
010            1011101         2          2
011            reserve         reserve    reserve
100            reserve         reserve    reserve
101            reserve         reserve    reserve
110            reserve         reserve    reserve
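The decoder in Table 4.1 is a simple lookup. The following Python sketch reproduces the three defined rows; the function name is hypothetical, the bit order of the 7-bit pattern is taken as written in the table, and returning None for reserved codes is an assumption made only for this illustration.

```python
# 7-bit segment patterns copied from Table 4.1 (bit order as written there).
LED_PATTERNS = {
    0b000: "1110111",  # digit 0
    0b001: "0010010",  # digit 1
    0b010: "1011101",  # digit 2
}


def led_decode(code: int):
    """Return the 7-bit segment pattern for a 3-bit code, or None if reserved."""
    if not 0 <= code <= 0b111:
        raise ValueError("code must fit in three bits")
    return LED_PATTERNS.get(code)


for code in range(4):
    print(format(code, "03b"), led_decode(code))
```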


































Chapter 5: System Implementation 
5.1 Introduction 
This chapter presents the implementation of the Preprocessing Voice Signal system
based on the design described in Chapter 4. Also presented in this chapter is a
description of the software chosen to support the programmable logic design used
in voice recognition, which is Xilinx's WebPACK ISE. It is our main implementation
tool. Together with the software, one of the most common hardware description languages,
VHDL, will be used to describe the system structurally and behaviorally. This
chapter then lists the steps taken throughout the implementation process, including the
approach taken.
Chapter 5 continues with the pin description of the modules covered at the top 
level design. This chapter ends with the explanation of the VHDL description of every 












5.2 XILINX WebPACK ISE 
The software chosen to support the programmable logic design used in
voice recognition comes from Xilinx, a leading innovator of complete programmable
logic solutions. WebPACK ISE is a software solution that contains support for advanced
HDL entry, synthesis, simulation, and verification capabilities for both CPLD and FPGA
designs. WebPACK ISE modules provide complete design implementation control. It is
advanced software that integrates with VHDL for describing digital designs through an
integrated VHDL simulator, Model Sim XE, that comes with the product. Xilinx's
WebPACK ISE can also be used in combination with other tools, including schematic
editors, synthesis software, high-level design tools and other tools available from third
parties, to form a complete design environment.
Xilinx's WebPACK ISE
It provides many useful features to help create, modify and process VHDL
projects. Some of the main features include:
• Hierarchy browser
o shows an up-to-date view of a design structure.
o useful for projects involving multiple VHDL source files, called
modules, and/or multiple levels of hierarchy.
• Module and test bench wizards
o help create new VHDL design descriptions by first asking










Voice Recognition System Implementation 
Wizard that generates VHDL source file templates based on the
pre-defined requirements.
• Built-in dependency feature
o helps to streamline the processing of a design for simulation and
synthesis
o eliminates the need to compile each VHDL source file in a design
or to keep track of source file dependencies.
The interface of Xilinx's WebPACK ISE consists of menu bar options which users click
on to perform the required action. Using Xilinx's WebPACK ISE, users can either create
entirely new projects or manage existing VHDL projects. In Xilinx's WebPACK ISE, the
design flow starts with creating a new project or opening an existing project (Figure 5.1).
The browser will list the VHDL source files. Double-clicking on any of the files listed
will open the editing window for that particular source file (Figure 5.2). The next step is
to compile the source file, in other words to check for any errors in the source
code. Each of the compiled source files will be linked together and an executable
simulation file will be generated. The project can be synthesized, to be implemented in design










Figure 5.1: Xilinx's WebPACK ISE interface on 'File' menu bar options
[Screenshot: Xilinx's WebPACK ISE editing window. The project view lists the VHDL sources, including channel.vhd and its test benches; the editor shows an entity declaration with the generics ADC_WIDTH (20) and CHANNEL_DURATION; the process list includes Synthesize, Implement Design and Generate Programming File.]










Model Sim XE 
The executable simulation is loaded to Model Sim XE application program in order to 
simulate the source files for functional validation. Any errors that occur during 
compiling, linking or simulating will be reported back to Model Sim XE 'Transcript 
Window' (Figure 5.3). The Transcript Window collects and displays messages generated 
by the Model Sim XE compiler, linker, simulator and other functional programs. 
[Screenshot: Model Sim XE Transcript Window. The transcript shows the ModelSim XE vcom 5.5e_p1 compiler loading the packages standard, std_logic_1164, std_logic_arith, std_logic_unsigned and numeric_std, compiling the entity channel, the architecture channel_arch of channel, the entity testbench and the architecture behavior of testbench, and then the vsim command loading testbench for simulation.]









5.3 VHDL and Preprocessing Voice Signal 
As mentioned in Chapter 1, the scope of the system
determines the range of the system and how it will work. The main scope of the project is to
design a voice recognition system in terms of preprocessing the voice signal, then
implementing a suitable voice recognition algorithm and, at the same time, generating VHDL
source code. Individual modules will be defined in terms of their functions and
the interconnections between them.
VHDL is a language for describing digital systems [Douglas, 1998]. Such
descriptions can be used for simulation of the system behavior using a simulator
without having to actually construct the system. Alternatively, a synthesis compiler can
utilize such a description to create descriptions of the digital hardware for further
implementation of the system on an FPGA chip.
VHDL will be used to describe this system behaviorally and structurally.
Behavioral descriptions are necessary early in the design so that simulations can be done
to ensure that the design is functionally correct. This functionality may then be verified at
more than one level of abstraction. This gives an advantage because any changes can
be made totally independently of the physical implementation, thus reducing time and cost.
Next, the design can be translated to a structural description composed of its major
components. Simulation at this stage of description ensures that the structural design
correctly performs the intended functions using the design's major components. While
there were many hardware description languages prior to VHDL, most of them were
developed to serve the simulators that ran them. VHDL, on the other hand, is technology










methodology on a designer, thus making it a standard and suitable language that offers
benefits over other hardware description languages. Chapter 3 has pointed out the
characteristics of VHDL together with its advantages.
5.4 Implementation Steps 
The implementation steps were divided into three main stages. The first stage is referred to as
the 'implementation stage'. It involves writing the behavioral VHDL description
program, called the VHDL source code, for each portion of the partitioned design in the
form of a module. The whole design of the Preprocessing Voice Signal system consists of three
major modules. Next, the VHDL code written is compiled and debugged. If any errors
occur during compilation, the errors (or bugs) must be fixed before
continuing with the further steps. This step is important in order to ensure the correctness
of the VHDL code.
The second stage is referred to as the 'simulation stage'. A 'test bench' script
will be written for each VHDL code produced in the implementation stage. The purpose
of the test bench is to allow the simulation process to take place and to establish the clock
generator circuit. Next, the simulation process is run to check the
functionality of each module in the design. Lastly comes the functional verification
of the modules, which reveals whether any module is malfunctioning.
The third and last stage is the self-test stage. The circuit will undergo the test
process simulation, which puts the circuit into testing mode. Throughout the stage, the
steps involved are as follows:










• generating test vectors 
• comparing the output signature with a known good signature using test 
bench. 
5.5 Modules Pin Description 
The front end of the voice recognition system consists of 3 main modules and 2 small
modules:
a) A clock generator module which outputs the left/right channel select signals
b) A channel module which contains the shift registers, buffers, read control, and
overflow detection circuitry for a single input stream of data
c) A top-level module, which combines the clock generator module with two
channel modules to form a complete codec interface circuit.
d) A clock divider module, used to slow down the main clock input
e) An LED decoder, which generates the LED number outputs
5.5.1 Top level pin description of Preprocessing Voice Signal 
Figure 5.4 below describes the input and output pins of the main modules of the
Preprocessing Voice Signal system design. Overall there are 5 input pins and 6 output
pins incorporated in the design. Each input and output pin has its own design

















[Block diagram: CLKGEN and CHANNEL blocks with the signals clk, reset, lrsel, bit_cntr, subcycle_cntr, chan_on, adc_out and adc_out_rdy.]






Figure 5.4: Top level pin description of Preprocessing Voice Signal system
5.5.2 Codec interface module Pin Description 
It is a top-level module, which combines the clock generator module with two channel 





















Pin                          In/Out   Description
clk                          In       Clock
                                      The main clock input, 12 MHz clock from the XS
                                      Board.
                                      Used for timing purposes.
reset                        In       Synchronous Reset
                                      Synchronously resets the counter of the two channel
                                      modules and the clock generator.
lrsel                        In       Left/Right Channel Selector
                                      This input selects either the right or left channel for
                                      parallel read operation.
rd                           In       Read
                                      Read from codec ADC.
sdout                        In       Serial Data Out
                                      The serial data stream from the codec ADC is shifted in
                                      through this input.
ladc_out, radc_out           Out      Left/Right ADC Output
                                      The bits stored in the left and right ADC shift registers
                                      are read out in parallel through these outputs.
ladc_out_rdy, radc_out_rdy   Out      Left/Right ADC Output Ready
                                      These outputs go high after all the bits have been shifted
                                      from the codec into the left or right ADC shift register,
                                      respectively.
adc_overrun                  Out      ADC Overrun
                                      This output goes high if new serial data is shifted into
                                      either the left or right ADC shift register before the old
                                      contents have been read out through the parallel outputs.
Sclk                         Out      Serial Data Clock
                                      The clock used for synchronizing serial data transfer
                                      between the FPLD and codec.









5.5.3 Channel module Pin Description 
It contains shift registers, buffers, read control, and overflow detection circuitry for a














Pin              In/Out   Description
clk              In       Clock
                          The main clock input, 12 MHz clock from the XS
                          Board.
                          Used for timing purposes.
reset            In       Synchronous Reset
                          Synchronously resets the channel.
chan_on          In       Channel On
                          A high level on this input activates the channel.
                          This input is usually controlled by the left/right channel
                          selector.
bit_cntr         In       Bit Counter
                          These inputs inform the channel of the index of the
                          serial data bit currently being transmitted and received.
subcycle_cntr    In       Subcycle Counter
                          The duration of each serial data bit is divided into four
                          phases and this input indicates the current phase from
                          the clock generator module.
chan_sel         In       Channel Select
                          ... the shift registers be read.
rd               In       Read
                          A high level on this input outputs the value stored in the
                          shift register connected to the ADC.
sdout            In       Serial Data Out
                          The serial data stream from the codec ADC is shifted in
                          through this input.
adc_out          Out      ADC Output
                          The bits stored in the ADC shift register are read out in
                          parallel through these outputs.
adc_out_rdy      Out      ADC Output Ready
                          This output goes high after all the bits have been shifted
                          from the codec into the ADC shift register.
adc_overrun      Out      ADC Overrun
                          This output goes high if new serial data is shifted into
                          the ADC shift register before the old contents have been
                          read out through the parallel outputs.
Table 5.2: Channel module pins description 
5.5.4 Clock Generator module Pin Description 





Pin              In/Out   Description
clk              In       Clock
                          The main clock input, 12 MHz clock from the XS
                          Board.
                          Used for timing purposes.
reset            In       Synchronous Reset
                          Synchronously resets the counter of the clock generator.
lrck             Out      Left/Right Codec Channel Select
                          This output controls the activation of the left and right
                          channel circuitry in the codec and the FPGA.
bit_cntr         Out      Bit Counter
                          These outputs indicate the current bit being transmitted
                          and received in the serial data streams.
subcycle_cntr    Out      Subcycle Counter
                          The duration of each serial data bit is divided into four
                          phases and these outputs indicate the current phase.
Sclk             Out      Serial Data Clock
                          The clock used for synchronizing serial data transfer
                          between the FPLD and codec.
Table 5.3: Clock Generator module Pin Description 
5.5.5 Clock divider module Pin description 
This smaller module is created to produce a slower clock of 1 Hz that will be used in
implementing the voice recognition algorithm.
[Block diagram: Clock Divider block with input clock_12Mhz and output clock_1Hz.]
Pin              In/Out   Description
clock_12Mhz      In       Clock 12 MHz
                          The main clock input, 12 MHz clock from the XS Board.
                          Used for timing purposes.
clock_1MHz,      Out      Clock 1 MHz, Clock 100 KHz, Clock 10 KHz, Clock 1 KHz,
clock_100KHz,             Clock 100 Hz, Clock 10 Hz, Clock 1 Hz
clock_10KHz,              Output the divided clocks respectively to give us a slower
clock_1KHz,               clock of 1 Hz.
clock_100Hz,
clock_10Hz,
clock_1Hz
Table 5.4: Clock divider module pins description 
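The output frequencies listed in Table 5.4 follow from a chain of counters. The Python sketch below only illustrates the arithmetic; the divide ratios (first by 12, then by 10 at each stage) are an assumption inferred from the pin names, and the function names are hypothetical.

```python
def divider_chain(input_hz=12_000_000, ratios=(12, 10, 10, 10, 10, 10, 10)):
    """Return the frequency after each divider stage (ratios are assumed values)."""
    freqs = []
    hz = input_hz
    for ratio in ratios:
        if hz % ratio:
            raise ValueError("ratio does not divide the input frequency evenly")
        hz //= ratio
        freqs.append(hz)
    return freqs


def output_pulses(input_ticks, ratio):
    """Count the pulses a divide-by-ratio counter emits for a number of input ticks."""
    count = pulses = 0
    for _ in range(input_ticks):
        count += 1
        if count == ratio:  # terminal count reached: emit a pulse and restart
            count = 0
            pulses += 1
    return pulses


print(divider_chain())
```

Under these assumed ratios the seven stages correspond to the seven outputs from clock_1MHz down to clock_1Hz.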





[Block diagram: the LED Decoder receives input D2 (among its data inputs) and drives outputs S0 to S6, which are connected to the segments of the 7-segment LED.]
Figure 5.5: High Level Diagram of the LED Decoder
An LED decoder takes a three-bit input from the Decision Module from Voice










digit. The LED segments will be driven to display the digit corresponding to the value of 
the three input bits as mentioned in Chapter 4. 
5.6 Writing VHDL code 
Using VHDL, the description of each module in the Preprocessing Voice Signal
design starts with an entity statement at the top level of the VHDL hardware specification
hierarchy. The entity statement defines the input and output ports together with the
direction of each signal flowing through each port. Associated with the entity is the
architecture, which defines one particular implementation of a design unit at some level of
abstraction. For this system, a behavioral architecture description containing
algorithmic statements (for example, condition, assignment and loop statements) will be
applied within a process statement. The process statement is an independent sequential
behavior that executes concurrently with the other process statements in the architecture. Also
applied is the instantiation of predefined components within the architecture. The overall
design of this thesis consists of 5 VHDL modules, namely:














5.7 Test bench
For the top level entity, a test bench is created as a new module and added to the
Xilinx WebPACK ISE project hierarchy containing all the Preprocessing Voice Signal
system's source files. The test bench module instantiates the top level as a unit under test
and is used to drive the input ports and subsequently read the output ports of the top level
during simulation. Simulation of each module of the design was performed to validate the
functional behavior of each portion as well as of the top level. Normal and testing
operation will be exercised to validate functional behavior during the system testing phase. The
simulation process and the related results will be discussed in detail in the next chapter.
BEGIN 
uut: channel PORT MAP( 
clk => clk,
reset => reset,
chan_on => chan_on,
bit_cntr => bit_cntr,
subcycle_cntr => subcycle_cntr,
chan_sel => chan_sel,
rd => rd,
adc_out => adc_out,
adc_out_rdy => adc_out_rdy,
adc_overrun => adc_overrun,




Clock_cycle <= Clock_cycle + 1;
Clk <= '1';
wait for 41.5 ns;
Clk <= '0';
wait for 41.5 ns;
end process;
-- *** Test Bench - User Defined Section ***
tb : PROCESS 
BEGIN 




subcycle_cntr <= "10";
chan_sel <= '0';











Chapter 6 : System Testing 
6.1 Introduction 
So far, the discussion has presented how the Preprocessing
Voice Signal system has been implemented. This chapter presents
the system testing, or system validation, carried out to make sure the system is functioning
as expected theoretically. System testing involves the simulation of all the
modules modeled in VHDL descriptions. This chapter discusses all the steps
considered during system testing, beginning with the simulation of the smaller modules and ending with the
simulation of the high-level module integrated with the smaller modules.
6.2 Design Simulation 
Design simulation starts with checking for any syntax errors in the VHDL code
written. If there are any syntax errors, modifications must be made to the source code, and
the process is repeated. The next step is to link all the source files, followed by
simulating the files linked together. The simulation process is carried out using the Model Sim
simulator. Simulation results are generated in the form of output waveforms of the
VHDL simulation run and displayed in the Model Sim 'Waveform' display window for system
validation.
6.3 Clock Generator module Simulation 
The Clock Generator module is used to output the left/right channel select signals,










Voice Recognition System Testing 
Sclk (Serial Interface Clock). In the test bench we start the simulation with the left
channel being selected first (the lrck input port is set to 'left').
Figure 6.1: Timing diagram of the main clock used in CLKGEN.VHD 
In the module, there is a process that increments the sequencing counter and toggles the
left/right channel selector when the count reaches the duration for which a channel is
active. The codec chip requires that the channel duration be 128 clock periods in length.
Therefore, handling the left channel alone requires 128 clock periods, and so does the right
channel. This means that the left channel will be active for 128 clock periods. After 128
clock periods, the channel will be toggled, and the right port will then be high. Thus, the
total time to handle both channels is 256 clock periods, and each port will take its turn to be
high for 128 clock periods (Figure 6.1).
The Bit Counter output port that is used in this module is a 6-bit port. To complete a
single channel duration of 128 clock periods, it will output the position of the current data
bit in the serial stream over a length of 32 phases, where each phase consists of 4 counts of
the subcycle counter. Thus, to handle both channels a 6-bit bit counter port is used, where









The Subcycle Counter output port is used to output the position within a bit. It is
given by the two LSBs of the sequence counter.
The Sclk is one quarter of the main clock. Thus, transmitting a 20-bit value will require
4 × 20 = 80 clock periods.
This will fit within the shortest possible channel duration.
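The counter relationships described above can be summarised in a short VHDL sketch. It is not the CLKGEN.VHD listing from the appendix: the entity name clkgen_sketch and the use of numeric_std are assumptions made purely for illustration. An 8-bit sequence counter runs from 0 to 127, its upper six bits give the bit position, its two LSBs give the subcycle, and bit 1 of the counter forms a serial clock at one quarter of the main clock rate.

```vhdl
LIBRARY ieee;
USE ieee.std_logic_1164.ALL;
USE ieee.numeric_std.ALL;

-- Illustrative sketch only (hypothetical entity name); not the thesis's CLKGEN.VHD.
ENTITY clkgen_sketch IS
  GENERIC (CHANNEL_DURATION : positive := 128);        -- clock periods per channel
  PORT (
    clk           : IN  std_logic;
    reset         : IN  std_logic;                     -- synchronous, active high
    sclk          : OUT std_logic;                     -- main clock divided by 4
    lrck          : OUT std_logic;                     -- '0' = left, '1' = right
    bit_cntr      : OUT std_logic_vector(5 DOWNTO 0);  -- bit position in the channel slot
    subcycle_cntr : OUT std_logic_vector(1 DOWNTO 0)   -- phase within a bit
  );
END clkgen_sketch;

ARCHITECTURE sketch OF clkgen_sketch IS
  SIGNAL seq      : unsigned(7 DOWNTO 0) := (OTHERS => '0');
  SIGNAL lrck_int : std_logic := '0';
BEGIN
  PROCESS (clk)
  BEGIN
    IF rising_edge(clk) THEN
      IF reset = '1' THEN
        seq      <= (OTHERS => '0');
        lrck_int <= '0';                     -- start with the left channel
      ELSIF seq = CHANNEL_DURATION - 1 THEN
        seq      <= (OTHERS => '0');         -- restart the sequence every 128 clocks
        lrck_int <= NOT lrck_int;            -- and switch to the other channel
      ELSE
        seq <= seq + 1;
      END IF;
    END IF;
  END PROCESS;

  lrck          <= lrck_int;
  sclk          <= seq(1);                             -- toggles every 2 clocks, period of 4 clocks
  bit_cntr      <= std_logic_vector(seq(7 DOWNTO 2));  -- 0 to 31 within a channel
  subcycle_cntr <= std_logic_vector(seq(1 DOWNTO 0));  -- 0 to 3 within a bit
END sketch;
```

In this sketch a 20-bit sample occupies bit positions 0 to 19, that is, the first 80 of the 128 clock periods in a channel slot, so the remaining 48 periods are idle.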
The expected simulation result of CLKGEN.VHD is shown below. (Figure 6.2)
Figure 6.2: Simulation result of CLKGEN.VHD module
6.4 Channel module Simulation 
There are 3 main processes in this module: 
• Receives data from codec ADC
• Handle reading of ADC data from codec interface 
• Detect overrun process 
6.4.1 Receives data from codec ADC
This process receives serial data from the ADC in the codec. The ADC shift









contain all the bits from the ADC. Once the reset is removed and the channel is on/active
(Channel port = '1', Sdout port = '1'), bits are shifted into the register.
Bits 1, 2, ..., up to the width of the ADC (ADC width = 20) are pushed into
the shift register. Then the shifting stops. Bits are shifted into the register during the third
subcycle of each bit period (the subcycles are numbered 0, 1, 2 and 3, equivalent
to "00", "01", "10" and "11"). This is because the serial data bit has plenty
of time to stabilize during the first two subcycles, and in the last subcycle the Subcycle
Counter port needs to prepare for an ADC shift register read process. The AK4520 (the
ADC integrated chip on the XS board) outputs data on the SDTO pin on the falling edge of
SCLK and accepts data on the SDTI pin on the rising edge of SCLK. (Figure 6.1) The
subcycle counter divides each serial bit into four phases. When subcycle_cntr = 2,
this is halfway between a falling edge of SCLK and a rising edge of SCLK. This is an
appropriate instant for the FPGA to get any data output by the AK4520 on SDTO and
send new data to the AK4520 through SDTI. Serial data will not be output and input
at the right time if the trigger value of subcycle_cntr is changed.
The shift register is marked as 'not full' as soon as a single bit is shifted in so that the
value will not be inadvertently read. The shift register status changes to full as soon as the
last bit enters the shift register.
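The receive behaviour described above can be sketched in VHDL as follows. This is a minimal illustration under stated assumptions, not the CHANNEL.VHD listing: the entity name rcv_adc_sketch is hypothetical, numeric_std is used for the comparisons, bit positions are assumed to run from 0 to ADC_WIDTH-1, and every sampled bit is shifted in regardless of its value.

```vhdl
LIBRARY ieee;
USE ieee.std_logic_1164.ALL;
USE ieee.numeric_std.ALL;

-- Illustrative sketch only (hypothetical entity name); not the thesis's CHANNEL.VHD.
ENTITY rcv_adc_sketch IS
  GENERIC (ADC_WIDTH : positive := 20);
  PORT (
    clk, reset    : IN  std_logic;
    chan_on       : IN  std_logic;                     -- this channel's slot is active
    sdout         : IN  std_logic;                     -- serial data from the codec ADC
    bit_cntr      : IN  std_logic_vector(5 DOWNTO 0);
    subcycle_cntr : IN  std_logic_vector(1 DOWNTO 0);
    adc_out       : OUT std_logic_vector(ADC_WIDTH-1 DOWNTO 0);
    adc_full      : OUT std_logic
  );
END rcv_adc_sketch;

ARCHITECTURE sketch OF rcv_adc_sketch IS
  SIGNAL shfreg : std_logic_vector(ADC_WIDTH-1 DOWNTO 0) := (OTHERS => '0');
BEGIN
  PROCESS (clk)
  BEGIN
    IF rising_edge(clk) THEN
      IF reset = '1' THEN
        shfreg   <= (OTHERS => '0');
        adc_full <= '0';
      ELSIF chan_on = '1' AND subcycle_cntr = "10" THEN  -- third subcycle of the bit
        IF unsigned(bit_cntr) < ADC_WIDTH THEN
          -- shift left; the new bit enters at position 0
          shfreg <= shfreg(ADC_WIDTH-2 DOWNTO 0) & sdout;
          IF unsigned(bit_cntr) = ADC_WIDTH - 1 THEN
            adc_full <= '1';                             -- last bit has entered
          ELSE
            adc_full <= '0';                             -- a new sample has started
          END IF;
        END IF;                                          -- later positions: shifting stops
      END IF;
    END IF;
  END PROCESS;

  adc_out <= shfreg;  -- not latched: changes while bits are shifted in
END sketch;
```

Sampling only when subcycle_cntr is "10" keeps the capture point away from both edges of the serial clock, which is the reason the text gives for choosing the third subcycle.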
For the first test bench (TSTCHANNEL_RCVADC.VHD), testing will be done based on










the shift register will be output at the adc_out port in a parallel format. These outputs are
not latched and will change as bits are shifted into the register. (Figure 6.3)
Figure 6.3: Simulation result of TSTCHANNEL_RCVADC.VHD
For the second test bench (TSTCHANNEL_RCVADC1.VHD), different values are set on
every input port in order to test the functionality of the Channel module. The
simulation result (Figure 6.4) shows that the chan_on and sdout ports must be set to
'high' for the bit shifting process to take place. As mentioned for the
Clock Generator module, the value of subcycle_cntr must also be set to 2 ("10"
in binary) to make sure that serial data is input. Meanwhile, the chan_sel
and rd input ports have no effect on the receiving data process. This is because both of them









Figure 6.4: Simulation result of TSTCHANNEL_RCVADC1.VHD
6.4.2 Handle reading of ADC data from codec interface 
As described in Chapter 4, a flag is maintained to indicate whether the
contents of the ADC shift register have been read. The flag is set when the ADC register
for the channel is full and it is selected for a read operation. The flag stays set after the
read operation is complete. Reading the register does not empty it. The shift register is no
longer full only when the first bit of the next sample is shifted into it. This will reset the
read flag.
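The read flag described above can be sketched as a small piece of VHDL. This is an illustration under assumptions rather than the thesis's own listing: the entity name read_flag_sketch is hypothetical, and adc_full is taken as an input that goes low when the first bit of the next sample arrives.

```vhdl
LIBRARY ieee;
USE ieee.std_logic_1164.ALL;

-- Illustrative sketch only (hypothetical entity name).
ENTITY read_flag_sketch IS
  PORT (
    clk, reset  : IN  std_logic;
    adc_full    : IN  std_logic;  -- shift register holds a complete sample
    chan_sel    : IN  std_logic;  -- this channel is selected for reading
    rd          : IN  std_logic;  -- read strobe
    adc_out_rdy : OUT std_logic   -- sample is full and has not been read yet
  );
END read_flag_sketch;

ARCHITECTURE sketch OF read_flag_sketch IS
  SIGNAL adc_rd     : std_logic := '0';
  SIGNAL adc_rd_nxt : std_logic;
BEGIN
  -- The flag is set by a read of a full register and held while the register
  -- stays full; it drops as soon as adc_full goes low for the next sample.
  adc_rd_nxt <= '1' WHEN adc_full = '1' AND
                         ((chan_sel = '1' AND rd = '1') OR adc_rd = '1')
                ELSE '0';

  PROCESS (clk)
  BEGIN
    IF rising_edge(clk) THEN
      IF reset = '1' THEN
        adc_rd <= '0';
      ELSE
        adc_rd <= adc_rd_nxt;
      END IF;
    END IF;
  END PROCESS;

  -- Data is ready only while the register is full and still unread.
  adc_out_rdy <= '1' WHEN adc_full = '1' AND adc_rd = '0' ELSE '0';
END sketch;
```

Because the flag depends on adc_full, reading the register never has to clear the data itself: the next sample overwrites it and clears the flag in the same step.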
The read_adc process updates the flag that indicates whether the ADC shift register
has been read. A status output is asserted when the data in the ADC shift register is ready
for reading. Reads are permitted when the register is full and has not yet been read.
The register is full when it reaches the width of the ADC. Thus, when the reading process is
executed, the adc_out_ready output port will be at 'high' value. This is shown in the









Figure 6.5: Simulation result of TSTCHANNEL_READOUT.VHD
The second test bench shows that the adc_out_ready output is cleared as soon as a
read occurs or new data is shifted into the register. This happens when the chan_sel port =
'1' AND the Read port = '1'. Besides that, the result also shows that the other input ports have
no effect on the adc_out_ready output. (Figure 6.6)










6.4.3 Detect overrun process 
This process monitors the ADC shift register and flags an error condition if the
register is already full but still begins accepting bits (when bit counter = 1) from the
current sample period while the data from the previous period has not yet been read,
which means that the adc_out_ready output port must still be at 'high' value. When
this condition happens, the adc_overrun port will be at 'high' value. (Figure 6.7)
Figure 6.7: Simulation result of TSTCHANNEL_OVERRUN.VHD
The second test bench shows that once the register has an overrun error, no other input
bit provides a 'curing' condition to correct the error. (Figure 6.8) Therefore, overrun
errors must be guarded against carefully by the system user.
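The overrun behaviour described in this section can be sketched as a sticky error flag. The sketch below is an assumption-laden illustration, not the appendix listing: the entity name overrun_sketch is hypothetical, and the trigger position (bit counter equal to 1) follows the description above.

```vhdl
LIBRARY ieee;
USE ieee.std_logic_1164.ALL;
USE ieee.numeric_std.ALL;

-- Illustrative sketch only (hypothetical entity name).
ENTITY overrun_sketch IS
  PORT (
    clk, reset  : IN  std_logic;
    chan_on     : IN  std_logic;                     -- channel slot is active
    adc_out_rdy : IN  std_logic;                     -- previous sample still unread
    bit_cntr    : IN  std_logic_vector(5 DOWNTO 0);
    adc_overrun : OUT std_logic
  );
END overrun_sketch;

ARCHITECTURE sketch OF overrun_sketch IS
BEGIN
  PROCESS (clk)
  BEGIN
    IF rising_edge(clk) THEN
      IF reset = '1' THEN
        adc_overrun <= '0';            -- only a reset clears the error
      ELSIF chan_on = '1' AND unsigned(bit_cntr) = 1 AND adc_out_rdy = '1' THEN
        adc_overrun <= '1';            -- new bits are arriving over unread data
      END IF;                          -- otherwise the flag keeps its value
    END IF;
  END PROCESS;
END sketch;
```

Since there is no branch that clears the flag other than reset, no combination of the other inputs can remove the error, which matches what the second test bench shows.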









6.5 Codec Interface module Simulation 
Codec Interface is a top-level module, which combines the clock generator
module with two channel modules to form a complete circuit. The 3 main processes that
occur in this module are as follows:
6.5.1 u0 process
One clock generator module will be instantiated. It receives the 12 MHz clock as
an input and generates the left/right clock for the codec. It also outputs the position of the
current bit in the serial stream and the current cycle within each bit period.
The input signals to the codec on the XStend V1.3 Board pass through inverters.
Therefore, the clock signals are inverted on these lines to remove the effect of the
inverters.
6.5.2 u_left and u_right process
This module, which handles the left channel of the codec, will be instantiated.
This module is activated during one half of the left/right clock period. It is selected for
reading by the left/right selection input. The test bench starts the simulation test with the
left channel being selected first for the read operation (lrsel input port is set to 'low').
The Read input port must be set to 'low' because the ADC data is ready when the register is
full and hasn't been read yet (Figure 6.9). Thus, the ladc_out_rdy output port will be
'high' right after all the bits have been shifted from the left channel of the codec into the
ADC shift register. After all the bits have been shifted from the left channel, the









The lrck output port will then be 'high' for 128 clock periods, representing the right clock
signals being activated, which is an effect of the inverters. After the first 128 clock periods,
lrck will be 'low' for another 128 clock periods. This goes on until the simulation
ends.
Figure 6.9: Simulation result of TSTCODEC_INTF.VHD
6.5.3 Overrun error
The overrun error indicator for the codec interface is formed by the logical-OR of
the associated error outputs of the left and right channel modules. Thus an error is
reported if either channel reports an error.
6.6 Clock Divider module Simulation 
Using the test bench, a slower clock running at 1 Hz can be produced from the clock
divider module. If we use one huge synchronous parallel counter, all the bits will change
simultaneously, causing a huge power pulse that will cause the chip to malfunction. The
solution is to use a small synchronous counter with a few bits to divide the clock down to










1 Hz. The ripple counter allows each bit to change at different times, eliminating the big
power pulse and allowing the chip to function correctly. The simulation result can be
seen in Figure 6.10.
Figure 6.10: Simulation result of TSTCLOCK_DIVIDER.VHD 
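One possible reading of the divider described above is a chain of small synchronous counters in which each stage is clocked by the output of the previous one, so that the stages do not all switch on the same edge. The sketch below follows that reading only. The entity names div_stage and clock_divider_sketch, and the split of the 12 MHz to 1 Hz division into factors of 100, 100, 100 and 12, are assumptions for illustration and are not taken from the thesis's clock divider module.

```vhdl
LIBRARY ieee;
USE ieee.std_logic_1164.ALL;

-- One small synchronous divide-by-N stage (hypothetical helper).
ENTITY div_stage IS
  GENERIC (N : positive := 100);
  PORT (clk_in : IN std_logic; clk_out : OUT std_logic);
END div_stage;

ARCHITECTURE sketch OF div_stage IS
  SIGNAL cnt : natural RANGE 0 TO N - 1 := 0;
  SIGNAL q   : std_logic := '0';
BEGIN
  PROCESS (clk_in)
  BEGIN
    IF rising_edge(clk_in) THEN
      IF cnt = N - 1 THEN
        cnt <= 0;
        q   <= '0';            -- output low for the first half of the count
      ELSE
        cnt <= cnt + 1;
        IF cnt + 1 = N / 2 THEN
          q <= '1';            -- output high for the second half
        END IF;
      END IF;
    END IF;
  END PROCESS;
  clk_out <= q;                -- one output period per N input periods
END sketch;

LIBRARY ieee;
USE ieee.std_logic_1164.ALL;

-- Four stages in series: 100 * 100 * 100 * 12 = 12,000,000.
ENTITY clock_divider_sketch IS
  PORT (clk_12mhz : IN std_logic; clk_1hz : OUT std_logic);
END clock_divider_sketch;

ARCHITECTURE sketch OF clock_divider_sketch IS
  SIGNAL c1, c2, c3 : std_logic;
BEGIN
  s1 : ENTITY work.div_stage GENERIC MAP (N => 100)
       PORT MAP (clk_in => clk_12mhz, clk_out => c1);   -- 120 kHz
  s2 : ENTITY work.div_stage GENERIC MAP (N => 100)
       PORT MAP (clk_in => c1, clk_out => c2);          -- 1.2 kHz
  s3 : ENTITY work.div_stage GENERIC MAP (N => 100)
       PORT MAP (clk_in => c2, clk_out => c3);          -- 12 Hz
  s4 : ENTITY work.div_stage GENERIC MAP (N => 12)
       PORT MAP (clk_in => c3, clk_out => clk_1hz);     -- 1 Hz
END sketch;
```

Only the first stage toggles at the full input rate; each later stage switches far less often and slightly later than the one before it, which is the property the text attributes to the ripple arrangement.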
6.7 Led module Simulation 
An LED decoder takes a three-bit input from the Decision Module (from the Voice
Recognition algorithm) and outputs seven signals which drive the segments of an LED
digit. In the test bench, only 3 values are tested, which are 0, 1 and 2. The
remaining values are kept in reserve. (Figure 6.11)
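A decoder of this kind can be sketched with a selected signal assignment. The entity name led_decoder_sketch, the segment order "gfedcba", the active-high polarity and the blank display for reserved codes are all assumptions made for illustration; the thesis's LED module may use a different encoding.

```vhdl
LIBRARY ieee;
USE ieee.std_logic_1164.ALL;

-- Illustrative sketch only (hypothetical entity name and segment encoding).
ENTITY led_decoder_sketch IS
  PORT (
    digit    : IN  std_logic_vector(2 DOWNTO 0);  -- code from the decision module
    segments : OUT std_logic_vector(6 DOWNTO 0)   -- assumed order "gfedcba", '1' = lit
  );
END led_decoder_sketch;

ARCHITECTURE sketch OF led_decoder_sketch IS
BEGIN
  WITH digit SELECT
    segments <= "0111111" WHEN "000",   -- 0: segments a b c d e f
                "0000110" WHEN "001",   -- 1: segments b c
                "1011011" WHEN "010",   -- 2: segments a b d e g
                "0000000" WHEN OTHERS;  -- reserved codes: display blank
END sketch;
```

Mapping every unused code to a blank display keeps the reserved values harmless until they are assigned a meaning.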









Voice Recognition System Evaluation
Chapter 7: System Evaluation 
7.1 Introduction 
This chapter presents the evaluation of the Preprocessing Voice Signal system. System
evaluation includes pointing out the problems encountered during system implementation
and testing, together with the solutions considered for those problems. This chapter
also lists a number of system strengths as well as the constraints related to the
Preprocessing Voice Signal system. Ideas on how to improve the existing
system design are also included as future enhancements, considering recent
technology that could be applied to the system in future. Lastly, it covers the
knowledge and experience gained while developing this system.
7.2 Problems Encountered and Solutions 
One of the main problems faced is that the Preprocessing Voice Signal system
design could not be implemented on the chosen development tool (XStend Board
V1.3.2) as proposed earlier because the board was damaged due to an unfortunate mistake.
A new board has yet to be bought because of its high cost, and the ordering process
takes some time. To prevent this issue from happening in future, students have
been advised to use the development tools with extra care.
System implementation revealed a lack of mastery in both modeling the Preprocessing
Voice Signal system using VHDL description and the Xilinx WebPACK ISE software.
Lack of mastery in the VHDL programming language and the new environment of the
ModelSim XE simulation tools created errors during compilation of modules in the










compilation errors. This causes the simulation to be suspended. A simulation strategy of
compiling the modules from the lowest-level subsystem upward helps to reduce the more
complex errors that may be produced in the VHDL description of higher-level
subsystems. Besides that, only one operation of each module can be tested at a
time when using Xilinx's WebPACK ISE compiler. This approach increases the time taken
to build each successful subsystem module. The more time spent on each
subsystem module, the longer it takes to complete the whole Preprocessing Voice Signal
system.
Another problem is that a successfully compiled module does not necessarily
function correctly as expected. It is necessary to look for logical errors when
a faulty value is shown in the output waveform generated by the compiler.
7.3 System Strengths 
The Preprocessing Voice Signal system highlights the use of the codec ADC on the XStend
Board. It is highly integrated with high performance to provide stereo analog-to-digital
converters using the delta-sigma conversion technique. Applications include reverb
processors, musical instruments, DAT and multitrack recorders. Thus it is a very suitable
device to be used in implementing the Preprocessing Voice Signal system.
This system also provides an easy use of voice recognition in preprocessing voice
signal by using VHDL as the hardware programming language. The coding is simple and
easy to understand even for a first-time user. By using VHDL, this design can be used










designed to be device-independent with behavioral simulation that permits design 
optimization. 
7.4 System Constraints 
One of the Preprocessing Voice Signal system drawbacks is the overrun error that 
may occur. As already mentioned in Chapter 4, an overrun error occurs when new data 
arrives when the receive buffer is full (there is no place to put the new data). The data 
that caused the overrun condition to be detected is lost and the last good data character 
that was received is flagged with the overrun error. It indicates whether new data sent in 
is overwriting the previous data received that has not been read out yet. The potential for 
data overruns, however, is always present. Data overruns must be guarded against since 
the overrun error condition can only be detected after one or more data characters have 
already been lost. The parity error, framing error, and overrun error indicate any 
problems with the current received data. 
7.5 Future Enhancements 
This section focuses on improvements that can be made to the existing
Preprocessing Voice Signal system. The main enhancement that can be made is to
program the chosen development board with the produced behavioral code so as to
provide a clear view of how the system actually works. This can be done as soon as the









Other than that, focus has been put on the displaying part, where 'vocabulary' and
'sentences' will be implemented in voice recognition to provide wider usage of the
Preprocessing Voice Signal system, instead of only numbers.
Another improvement that can be made is to ensure that the Preprocessing Voice
Signal system can receive input from noisy environments as well, by implementing
filters.
7.6 Knowledge and Experience Gained 
During the four consecutive months of developing this system, exposure was gained to what
is needed to develop a Preprocessing Voice Signal system from scratch. The
terms related to the given topic were researched after receiving the topic on the first day.
Basic concepts such as Analog-to-Digital converter, Serial and Parallel data
Shift Register, Clock divider and many more were studied to provide a better
understanding in developing this system. Discussion sessions with the Supervisor were
attended as frequently as possible to discuss issues related to the topic. A design proposal
was produced and presented to the Moderator. This continued with the design
implementation using the VHDL programming language and Xilinx's WebPACK ISE
simulation tools, followed by the testing of the system for its functionality and
verification. Again, a successful system was presented to the Moderator. The presentation
marks the last part of the system development process. It is noticed that during this











Voice Recognition Conclusion 
Conclusion 
The Preprocessing Voice Signal system has the advantage of using VHDL as the
programming language because of its portability, standardization, productivity, reusability
and lower cost of implementation. Throughout this thesis, the first four chapters
have presented the design issues and a review of existing designs. Chapter 5
explains in detail how the system is implemented with VHDL behavioral and
structural description. Chapter 6 presents the simulation results of the
Preprocessing Voice Signal system obtained in the form of waveform output. Positive
results shown in the waveform output verify that the system is successful. Having been
simulated and functionally verified, it has been proven in this thesis that the












System Coding
Description/ File name: CLKGEN.VHD
LIBRARY IEEE;
USE IEEE.std_logic_1164.ALL;
USE IEEE.std_logic_unsigned.ALL;
ENTITY clkgen IS
GENERIC
(
CHANNEL_DURATION: positive := 128 -- must be 128
);
PORT
(
-- interface I/O signals
clk: IN std_logic; -- clock input
reset: IN std_logic; -- synchronous active-high reset
-- codec chip clock signals
sclk: OUT std_logic; -- serial data clock to codec
lrck: OUT std_logic; -- left/right codec channel select
bit_cntr: OUT std_logic_vector(5 DOWNTO 0);
subcycle_cntr: OUT std_logic_vector(1 DOWNTO 0)
);
END clkgen;
ARCHITECTURE clkgen_arch OF clkgen IS
CONSTANT yes: STD_LOGIC := '1';
CONSTANT no: STD_LOGIC := '0';
CONSTANT ready: STD_LOGIC := '1';
CONSTANT overrun: STD_LOGIC := '1';
CONSTANT left: STD_LOGIC := '0';
CONSTANT right: STD_LOGIC := '1';
SIGNAL lrck_int: std_logic;
SIGNAL seq: std_logic_vector(7 DOWNTO 0);
BEGIN
gen_clock:
PROCESS(clk,seq,lrck_int)
BEGIN
IF (clk'event AND clk='1') THEN
IF(reset=YES) THEN -- synchronous reset
seq <= (OTHERS=>'0');
lrck_int <= left; -- start with left channel of codec
ELSIF(seq=CHANNEL_DURATION-1) THEN
seq <= (OTHERS=>'0'); -- reset sequencer every channel period
lrck_int <= NOT(lrck_int); -- toggle channel sel every period
ELSE
seq <= seq + 1;
END IF;
END IF;
END PROCESS;
lrck <= lrck_int; -- output the channel selector to the codec
sclk <= seq(1); -- serial data shift clock is 1/4 of the main clock
bit_cntr <= seq(7 DOWNTO 2);
subcycle_cntr <= seq(1 DOWNTO 0);
END clkgen_arch;









Voice Recognition Appendices
Description/ File name: TSTCLKGEN.VHD 
LIBRARY ieee; 
USE ieee.std_logic_1164.ALL;
USE ieee.numeric_std.ALL;
ENTITY testbench IS
END testbench;
ARCHITECTURE behavior OF testbench IS
COMPONENT clkgen
PORT(
clk: IN std_logic; -- clock input
reset: IN std_logic; -- synchronous active-high reset
sclk: OUT std_logic;
lrck: OUT std_logic; -- left/right codec channel select
bit_cntr: OUT std_logic_vector(5 DOWNTO 0);
subcycle_cntr: OUT std_logic_vector(1 DOWNTO 0)
);
END COMPONENT;
SIGNAL clk: std_logic;
SIGNAL reset : std_logic;
SIGNAL sclk: std_logic;
SIGNAL lrck: std_logic;
SIGNAL bit_cntr : std_logic_vector(5 downto 0);
SIGNAL subcycle_cntr : std_logic_vector(1 downto 0);
Signal Clock_cycle: natural := 0;
BEGIN
uut: clkgen PORT MAP(
clk => clk,
reset => reset,
sclk => sclk,
lrck => lrck,
bit_cntr => bit_cntr,




Clock_cycle <= Clock_cycle + 1;
Clk <= '1';
wait for 41.5 ns;
Clk <= '0';
wait for 41.5 ns;
end process;
-- *** Test Bench - User Defined Section ***
tb : PROCESS
BEGIN
reset <= '1';




-- *** End Test Bench - User Defined Section ***
END;










USE IEEE.std_logic_1164.ALL;
USE IEEE.std_logic_unsigned.ALL;
ENTITY channel IS 
GENERIC 
( 




-- interface I/O signals
clk: IN std_logic; -- clock input
reset: IN std_logic; -- synchronous active-high reset
chan_on: IN std_logic;
bit_cntr: IN std_logic_vector(5 DOWNTO 0);
subcycle_cntr: IN std_logic_vector(1 DOWNTO 0);
chan_sel: IN std_logic; -- select L/R channel for read/write
rd: IN std_logic; -- read from the codec ADC
sdout: IN std_logic; -- serial input from codec ADC
adc_out: OUT std_logic_vector(ADC_WIDTH-1 DOWNTO 0); -- ADC output
adc_out_rdy: OUT std_logic; -- ADC output is ready to be read
adc_overrun: OUT std_logic -- ADC overwritten before being read
);
END channel;
ARCHITECTURE channel_arch OF channel IS
CONSTANT yes: STD_LOGIC := '1';
CONSTANT no: STD_LOGIC := '0';
CONSTANT ready: STD_LOGIC := '1';
CONSTANT overrun: STD_LOGIC := '1';
CONSTANT left: STD_LOGIC := '0';
CONSTANT right: STD_LOGIC := '1';
SIGNAL adc_shfreg: std_logic_vector(ADC_WIDTH-1 DOWNTO 0);
SIGNAL adc_full: std_logic; -- ADC shift register is full
SIGNAL adc_rd: std_logic; -- the ADC channel has been read
SIGNAL adc_rd_nxt: std_logic; -- the ADC channel has been read
SIGNAL adc_out_rdy_int: std_logic; -- internal version of adc_out_rdy
BEGIN
-- receives data from codec ADC
rcv_adc:
PROCESS(clk,chan_on,subcycle_cntr,bit_cntr,adc_shfreg,sdout)
BEGIN
IF(clk'event AND (clk=YES)) THEN
IF(reset='1') THEN
adc_shfreg <= (OTHERS=>'0');
adc_full <= NO;
ELSIF((chan_on=YES) AND (subcycle_cntr=2)) THEN
IF(bit_cntr<ADC_WIDTH-1) THEN
adc_full <= NO;
adc_shfreg <= adc_shfreg(ADC_WIDTH-2 DOWNTO 0) & sdout;
ELSIF(bit_cntr=ADC_WIDTH-1) THEN
adc_full <= YES;





adc_out <= adc_shfreg;
-- handle reading of ADC data from codec interface
adc_rd_nxt <= YES WHEN (adc_full=YES AND chan_sel=YES AND rd=YES) OR
(adc_full=YES AND adc_rd=YES)
ELSE NO;
read_adc:









BEGIN
IF(clk'event AND clk='1') THEN
IF(reset=YES) THEN
adc_rd <= NO;
ELSE




-- ADC data is ready if register is full and hasn't been read yet
adc_out_rdy_int <= YES WHEN adc_full=YES AND adc_rd=NO ELSE NO;
adc_out_rdy <= adc_out_rdy_int;
-- detect and signal overwriting of data from the codec ADC channels
detect_adc_overrun:
PROCESS(clk,reset,bit_cntr,chan_on,adc_out_rdy_int)
BEGIN
IF(clk'event AND clk='1') THEN
IF(reset=YES) THEN
adc_overrun <= NO;






Description/ File name: TSTCHANNEL_RCVADC.VHD
LIBRARY ieee;
USE ieee.std_logic_1164.ALL;
USE ieee.numeric_std.ALL;
ENTITY testbench IS
END testbench;
ARCHITECTURE behavior OF testbench IS
COMPONENT channel
PORT(
clk: IN std_logic;
reset : IN std_logic;
chan_on: IN std_logic;
bit_cntr: IN std_logic_vector(5 downto 0);
subcycle_cntr: IN std_logic_vector(1 downto 0);
chan_sel : IN std_logic;
rd : IN std_logic;
sdout : IN std_logic;
adc_out: OUT std_logic_vector(19 downto 0);
adc_out_rdy: OUT std_logic;
adc_overrun: OUT std_logic
);
END COMPONENT;
SIGNAL clk: std_logic;
SIGNAL reset : std_logic;
SIGNAL chan_on : std_logic;
SIGNAL bit_cntr: std_logic_vector(5 downto 0);
SIGNAL subcycle_cntr : std_logic_vector(1 downto 0);
SIGNAL chan_sel : std_logic;
SIGNAL rd : std_logic;
SIGNAL adc_out: std_logic_vector(19 downto 0);
SIGNAL adc_out_rdy: std_logic;
SIGNAL adc_overrun : std_logic;
SIGNAL sdout : std_logic;











uut: channel PORT MAP(
clk => clk,
reset => reset,
chan_on => chan_on,
bit_cntr => bit_cntr,
subcycle_cntr => subcycle_cntr,
chan_sel => chan_sel,
rd => rd,
adc_out => adc_out,
adc_out_rdy => adc_out_rdy,
adc_overrun => adc_overrun,




Clock_cycle <= Clock_cycle + 1;
Clk<='1';
wait for 41.5 ns;
Clk<='0';
wait for 41.5 ns;
end process;
-- *** Test Bench - User Defined Section ***
tb: PROCESS -- rcv_adc (receive data from ADC)
BEGIN
--bit_cntr = 0
reset <= '1';
chan_on<='1';
bit_cntr<="000000";



wait for 100 ns; 








wait for 100 ns;
--bit_cntr = 2
reset <= '0';
chan_on<='1';
bit_cntr<="000010";
subcycle_cntr<="10";
chan_sel<='1';
rd<='0';
sdout<='1';
wait for 100 ns;
--bit_cntr = 3
chan_on<='1';
bit_cntr<="000011";
subcycle_cntr<="10";
chan_sel<='0';
rd<='1';
sdout<='1';












--bit_cntr = 5
--bit_cntr = 6
--bit_cntr = 7
--bit_cntr = 8
--bit_cntr = 9
--bit_cntr = 10











chan_sel<='0';
rd<='1';
sdout<='1';
wait for 100 ns;
chan_on<='1';
bit_cntr<="000110";




wait for 100 ns;
chan_on<='1';
bit_cntr<="000111";
subcycle_cntr<="10";
chan_sel<='0';
rd<='1';
sdout<='1';
wait for 100 ns;
chan_on<='1';
bit_cntr<="001000";
subcycle_cntr<="10";
chan_sel<='0';
rd<='1';
sdout<='1';
wait for 100 ns;
chan_on<='1';
bit_cntr<="001001";
subcycle_cntr<="10";
chan_sel<='0';
rd<='1';
sdout<='1';
wait for 100 ns;
chan_on<='1';
bit_cntr<="001010";




wait for 100 ns;
chan_on<='1';




sdout<='1';











--bit_cntr = 12










chan_sel<='0';
rd<='1';
sdout<='1';




chan_sel<='0';
rd<='1';
sdout<='1';
wait for 100 ns;
chan_on<='1';





wait for 100 ns; 
chan_on<='1';
bit_cntr<="001111";




wait for 100 ns; 
chan_on<='1';
bit_cntr<="010000";
subcycle_cntr<="10";
chan_sel<='0';
rd<='0';
sdout<='1';
wait for 100 ns;
chan_on<='1';
bit_cntr<="010001";













-- *** End Test Bench - User Defined Section ***
END;










LIBRARY ieee; 
USE ieee.std_logic_1164.ALL;
USE ieee.numeric_std.ALL;
ENTITY testbench IS
END testbench;
ARCHITECTURE behavior OF testbench IS
COMPONENT channel
PORT(
clk: IN std_logic;
reset : IN std_logic;
chan_on: IN std_logic;
bit_cntr: IN std_logic_vector(5 downto 0);
subcycle_cntr : IN std_logic_vector(1 downto 0);
chan_sel : IN std_logic;
rd: IN std_logic;
sdout : IN std_logic;





SIGNAL clk : std_logic;
SIGNAL reset: std_logic;
SIGNAL chan_on: std_logic;
SIGNAL bit_cntr: std_logic_vector(5 downto 0);
SIGNAL subcycle_cntr: std_logic_vector(1 downto 0);
SIGNAL chan_sel : std_logic;
SIGNAL rd : std_logic;
SIGNAL adc_out: std_logic_vector(19 downto 0);
SIGNAL adc_out_rdy: std_logic;
SIGNAL adc_overrun: std_logic;
SIGNAL sdout : std_logic;
SIGNAL Clock_cycle: natural := 0;
BEGIN
uut: channel PORT MAP(
clk => clk,
reset => reset,
chan_on => chan_on,
bit_cntr => bit_cntr,
subcycle_cntr => subcycle_cntr,
chan_sel => chan_sel,
rd => rd,
adc_out => adc_out,
adc_out_rdy => adc_out_rdy,
adc_overrun => adc_overrun,




Clock_cycle <= Clock_cycle + 1;
Clk<='1';
wait for 41.5 ns;
Clk<='0';
wait for 41.5 ns;
end process;
-- *** Test Bench - User Defined Section ***












--bit_cntr = 1
--bit_cntr = 2
--bit_cntr = 3
--bit_cntr = 4
--bit_cntr = 5










wait for 100 ns; 
reset <= '0';
chan_on<='1';





wait for 100 ns; 
reset <= '0';
chan_on<='1';
bit_cntr<="000010";
subcycle_cntr<="10";
chan_sel<='0';
rd<='0';
sdout<='1';




chan_sel<='0';
rd<='1';
sdout<='1';
wait for 100 ns;
chan_on<='1';
bit_cntr<="000100";




wait for 100 ns; 
chan_on<='1';
bit_cntr<="000101";




wait for 100 ns; 
chan_on<='1';
bit_cntr<="000110";
subcycle_cntr<="01";
chan_sel<='0';
rd<='1';
sdout<='1';
wait for 100 ns;
chan_on<='0';
bit_cntr<="000111";











--bit_cntr = 8
--bit_cntr = 9
--bit_cntr = 10
--bit_cntr = 11
--bit_cntr = 12
--bit_cntr = 13
--bit_cntr = 14
--bit_cntr = 15
chan_sel<='0';
rd<='1';
sdout<='1';
wait for 100 ns;
chan_on<='1';
bit_cntr<="001000";
subcycle_cntr<="10";
chan_sel<='0';
rd<='1';
sdout<='0';
wait for 100 ns;
chan_on<='1';
bit_cntr<="001001";




wait for 100 ns; 
chan_on<='1';
bit_cntr<="001010";




wait for 100 ns; 
chan_on<='1';
bit_cntr<="001011";
subcycle_cntr<="10";
chan_sel<='0';
rd<='1';
sdout<='1';
wait for 100 ns;
chan_on<='1';
bit_cntr<="001101";
subcycle_cntr<="10";
chan_sel<='0';
rd<='1';
sdout<='0';





subcycle_cntr<="10";
chan_sel<='0';
rd<='1';
sdout<='1';
wait for 100 ns;
chan_on<='0';
bit_cntr<="001110";




wait for 100 ns;
chan_on<='1';










subcycle_cntr<="10";
chan_sel<='0';
rd<='1';
sdout<='1';
wait for 100 ns;
--bit_cntr = 16
chan_on<='1';
bit_cntr<="010000";
subcycle_cntr<="00";
chan_sel<='0';
rd<='0';
sdout<='1';
wait for 100 ns;
--bit_cntr = 17
chan_on<='1';
bit_cntr<="010001";
subcycle_cntr<="10";
chan_sel<='0';
rd<='1';
sdout<='1';
wait for 100 ns;









-- *** End Test Bench - User Defined Section ***
END;
Description/ File name: TSTCHANNEL_READOUT.VHD
LIBRARY ieee;
USE ieee.std_logic_1164.ALL;
USE ieee.numeric_std.ALL;
ENTITY testbench IS
END testbench;
ARCHITECTURE behavior OF testbench IS
COMPONENT channel
PORT(
clk : IN std_logic;
reset: IN std_logic;
chan_on: IN std_logic;
bit_cntr : IN std_logic_vector(5 downto 0);
subcycle_cntr: IN std_logic_vector(1 downto 0);
chan_sel : IN std_logic;
rd : IN std_logic;
sdout : IN std_logic;
adc_out: OUT std_logic_vector(19 downto 0);
adc_out_rdy: OUT std_logic;
adc_overrun: OUT std_logic
);
END COMPONENT;
SIGNAL clk: std_logic;
SIGNAL reset : std_logic;










SIGNAL bit_cntr: std_logic_vector(5 downto 0);
SIGNAL subcycle_cntr: std_logic_vector(1 downto 0);
SIGNAL chan_sel : std_logic;
SIGNAL rd : std_logic;
SIGNAL adc_out: std_logic_vector(19 downto 0);
SIGNAL adc_out_rdy: std_logic;
SIGNAL adc_overrun : std_logic;
SIGNAL sdout: std_logic;
Signal Clock_cycle: natural := 0;
BEGIN
uut: channel PORT MAP(
clk => clk,
reset => reset,
chan_on => chan_on,
bit_cntr => bit_cntr,
subcycle_cntr => subcycle_cntr,
chan_sel => chan_sel,
rd => rd,
adc_out => adc_out,
adc_out_rdy => adc_out_rdy,
adc_overrun => adc_overrun,




Clock_cycle <= Clock_cycle + 1;
Clk<='1';
wait for 41.5 ns;
Clk<='0';
wait for 41.5 ns;
end process;
-- *** Test Bench - User Defined Section ***
tb: PROCESS
BEGIN
-- adc is full and read process will only be permitted
-- when bit counter = 19, (shift register is full) and
-- has not been read
--bit_cntr = 18
reset <= '1';





sdout<='1';
wait for 100 ns;
--bit_cntr = 19
reset <= '0';
chan_on<='1';
bit_cntr<="010011";




wait for 100 ns; 
chan_on<='1';
bit_cntr<="010011";
















-- *** End Test Bench - User Defined Section ***
END; 




ENTITY testbench IS
END testbench;
ARCHITECTURE behavior OF testbench IS
COMPONENT channel
PORT(
clk : IN std_logic;
reset : IN std_logic;
chan_on : IN std_logic;
bit_cntr : IN std_logic_vector(5 downto 0);
subcycle_cntr : IN std_logic_vector(1 downto 0);
chan_sel : IN std_logic;
rd : IN std_logic;
sdout : IN std_logic;
adc_out : OUT std_logic_vector(19 downto 0);
adc_out_rdy : OUT std_logic;
adc_overrun : OUT std_logic
);
END COMPONENT;
SIGNAL clk : std_logic;
SIGNAL reset : std_logic;
SIGNAL chan_on : std_logic;
SIGNAL bit_cntr : std_logic_vector(5 downto 0);
SIGNAL subcycle_cntr : std_logic_vector(1 downto 0);
SIGNAL chan_sel : std_logic;
SIGNAL rd : std_logic;
SIGNAL adc_out : std_logic_vector(19 downto 0);
SIGNAL adc_out_rdy : std_logic;
SIGNAL adc_overrun : std_logic;
SIGNAL sdout : std_logic;
SIGNAL Clock_cycle : natural := 0;
BEGIN 
uut: channel PORT MAP(
clk => clk,
reset => reset,
chan_on => chan_on,
bit_cntr => bit_cntr,
subcycle_cntr => subcycle_cntr,
chan_sel => chan_sel,
rd => rd,
adc_out => adc_out,
adc_out_rdy => adc_out_rdy,
adc_overrun => adc_overrun,
sdout => sdout
);
CLOCK: process
begin
Clock_cycle <= Clock_cycle + 1;
Clk <= '1';
wait for 41.5 ns;









Clk <= '0';
wait for 41.5 ns; 
end process; 
-- *** Test Bench - User Defined Section *** 
tb : PROCESS 
BEGIN 
-- adc is full and read process will only be permitted 
--when bit counter= 19, (shift register is full) and 





subcycle_cntr <= "10";
chan_sel <= '1';
rd <= '0';
sdout <= '1';
wait for 100 ns;
-- bit_cntr = 19
reset <= '0';
chan_on <= '1';
bit_cntr <= "010011";




wait for 100 ns; 
-- ADC data is ready if register is full and has not been
-- read yet, where channel select and read, rd = 0
chan_on <= '1';
bit_cntr <= "010011";




wait for 100 ns ; 
chan_on <= '1';
bit_cntr <= "010011";




wait for 100 ns ; 
chan_on <= '0';
bit_cntr <= "010011";
subcycle_cntr <= "10";
chan_sel <= '0';
rd <= '1';
sdout <= '1';
wait for 100 ns;
chan_on <= '1';
bit_cntr <= "010011";
subcycle_cntr <= "10";
chan_sel <= '1';
rd <= '1';
sdout <= '1';










chan_on <= '1';
bit_cntr <= "010011";






-- *** End Test Bench - User Defined Section ***
Description/ File name: TSTCHANNEL_OVERRUN.VHD 
LIBRARY ieee;
USE ieee.std_logic_1164.ALL;
USE ieee.numeric_std.ALL;
ENTITY testbench IS 
END testbench; 
ARCHITECTURE behavior OF testbench IS 
COMPONENT channel 
PORT( 
clk : IN std_logic;
reset : IN std_logic;
chan_on : IN std_logic;
bit_cntr : IN std_logic_vector(5 downto 0);
subcycle_cntr : IN std_logic_vector(1 downto 0);
chan_sel : IN std_logic;
rd : IN std_logic;
sdout : IN std_logic;
adc_out : OUT std_logic_vector(19 downto 0);
adc_out_rdy : OUT std_logic;
adc_overrun : OUT std_logic
); 
END COMPONENT; 
SIGNAL clk : std_logic;
SIGNAL reset : std_logic;
SIGNAL chan_on : std_logic;
SIGNAL bit_cntr : std_logic_vector(5 downto 0);
SIGNAL subcycle_cntr : std_logic_vector(1 downto 0);
SIGNAL chan_sel : std_logic;
SIGNAL rd : std_logic;
SIGNAL adc_out : std_logic_vector(19 downto 0);
SIGNAL adc_out_rdy : std_logic;
SIGNAL adc_overrun : std_logic;
SIGNAL sdout : std_logic;
SIGNAL Clock_cycle : natural := 0;
BEGIN 
uut: channel PORT MAP(
clk => clk,
reset => reset,
chan_on => chan_on,
bit_cntr => bit_cntr,
subcycle_cntr => subcycle_cntr,
chan_sel => chan_sel,
rd => rd,
adc_out => adc_out,
adc_out_rdy => adc_out_rdy,
adc_overrun => adc_overrun,
sdout => sdout
);
CLOCK: process












begin 
Clock_cycle <= Clock_cycle + 1;
Clk <= '1';
wait for 41.5 ns;
Clk <= '0';
wait for 41.5 ns; 
end process; 
-- *** Test Bench - User Defined Section *** 
tb : PROCESS 
BEGIN 
--to test overrun (when adc is full and the old contents 
--have not been read out) 
--when bit counter = chan_on = 1
reset <= '1';
chan_on <= '1';
bit_cntr <= "010011"; -- adc is full when bit counter = 19
subcycle_cntr <= "10";
chan_sel <= '0';
rd <= '0';
sdout <= '1';
wait for 100 ns; 
reset <= '0'; 
chan_on <= '1';





wait for 100 ns ; 
chan_on <= '1';
bit_cntr <= "000001";
subcycle_cntr <= "10";
chan_sel <= '0';
rd <= '0';
sdout <= '1';






sdout <= '1';
wait;
END PROCESS;
-- *** End Test Bench - User Defined Section ***
END; 
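The comments in the test bench above describe two conditions: the ADC output becomes ready once the shift register is full (bit counter = 19) and has not yet been read, and an overrun is flagged when a new conversion begins (bit counter = 1) while the previous word is still unread. A minimal sketch of flag logic with that behaviour might look as follows. The entity name adc_flags_sketch, the priority order of the branches, the synchronous reset and the choice to hold the overrun flag until reset are all assumptions made for illustration; this is not the channel entity under test.

```vhdl
-- Hypothetical sketch only: names and branch priority are assumptions.
LIBRARY ieee;
USE ieee.std_logic_1164.ALL;
USE ieee.numeric_std.ALL;

ENTITY adc_flags_sketch IS
  PORT(
    clk         : IN  std_logic;
    reset       : IN  std_logic;                     -- synchronous, active high
    chan_on     : IN  std_logic;                     -- this channel is being shifted
    chan_sel    : IN  std_logic;                     -- this channel is selected for read
    rd          : IN  std_logic;                     -- read strobe
    bit_cntr    : IN  std_logic_vector(5 downto 0);
    adc_out_rdy : OUT std_logic;
    adc_overrun : OUT std_logic
  );
END adc_flags_sketch;

ARCHITECTURE sketch OF adc_flags_sketch IS
  SIGNAL rdy : std_logic := '0';
  SIGNAL ovr : std_logic := '0';
BEGIN
  PROCESS (clk)
  BEGIN
    IF rising_edge(clk) THEN
      IF reset = '1' THEN
        rdy <= '0';
        ovr <= '0';
      ELSIF chan_sel = '1' AND rd = '1' THEN
        rdy <= '0';                                  -- a read clears the ready flag
      ELSIF chan_on = '1' AND unsigned(bit_cntr) = 19 THEN
        rdy <= '1';                                  -- shift register is full
      ELSIF chan_on = '1' AND unsigned(bit_cntr) = 1 AND rdy = '1' THEN
        ovr <= '1';                                  -- new word started, old one unread
      END IF;
    END IF;
  END PROCESS;
  adc_out_rdy <= rdy;
  adc_overrun <= ovr;
END sketch;
```

Internal signals rdy and ovr are used because VHDL-93 does not allow an OUT port to be read back inside the architecture.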















ARCHITECTURE behavior OF testbench IS 
COMPONENT channel 
PORT( 
clk : IN std_logic;
reset : IN std_logic;
chan_on : IN std_logic;
bit_cntr : IN std_logic_vector(5 downto 0);
subcycle_cntr : IN std_logic_vector(1 downto 0);
chan_sel : IN std_logic;
rd : IN std_logic;
sdout : IN std_logic;
adc_out : OUT std_logic_vector(19 downto 0);
adc_out_rdy: OUT std_logic; 
adc_overrun: OUT std_logic 
); 
END COMPONENT; 
SIGNAL clk : std_logic;
SIGNAL reset : std_logic;
SIGNAL chan_on : std_logic;
SIGNAL bit_cntr : std_logic_vector(5 downto 0);
SIGNAL subcycle_cntr : std_logic_vector(1 downto 0);
SIGNAL chan_sel : std_logic;
SIGNAL rd : std_logic;
SIGNAL adc_out : std_logic_vector(19 downto 0);
SIGNAL adc_out_rdy : std_logic;
SIGNAL adc_overrun : std_logic;
SIGNAL sdout : std_logic;
SIGNAL Clock_cycle : natural := 0;
BEGIN 
uut: channel PORT MAP(
clk => clk,
reset => reset,
chan_on => chan_on,
bit_cntr => bit_cntr,
subcycle_cntr => subcycle_cntr,
chan_sel => chan_sel,
rd => rd,
adc_out => adc_out,
adc_out_rdy => adc_out_rdy,
adc_overrun => adc_overrun,
sdout => sdout
);
CLOCK: process
begin
Clock_cycle <= Clock_cycle + 1;
Clk <= '1';
wait for 41.5 ns;
Clk <= '0';
wait for 41.5 ns;
end process;
-- *** Test Bench - User Defined Section ***
tb : PROCESS
BEGIN
--to test overrun (when adc is full and the old contents
--have not been read out)
--when bit counter = chan_on = 1
reset <= '1';
chan_on <= '1';
bit_cntr <= "010011"; -- adc is full when bit counter = 19













wait for 100 ns; 
reset <= '0'; 
chan_on <= '1';
bit_cntr <= "010011";




wait for 100 ns ; 
chan_on <= '1';
bit_cntr <= "010011";




wait for 100 ns ; 
chan_on <= '1';
bit_cntr <= "000001"; -- overrun happens when bit_cntr = 1
subcycle_cntr <= "10";
chan_sel <= '0';
rd <= '0';
sdout <= '1';
wait for 100 ns;
chan_on <= '0';
bit_cntr <= "000001";
subcycle_cntr <= "10";
chan_sel <= '0';
rd <= '1';
sdout <= '0';
wait for 100 ns; 






wait for 100 ns; 
chan_on <= '1';
bit_cntr <= "000011";




wait for 100 ns; 
chan_on <= '1';
bit_cntr <= "000111";
subcycle_cntr <= "10";




END PROCESS;











Description/ File name: CODEC_INTFC.VHD 
LIBRARY IEEE; 
USE IEEE.std_logic_1164.ALL;
USE IEEE.std_logic_unsigned.ALL;
ENTITY codec_intfc IS
GENERIC
(
ADC_WIDTH : positive := 20;




-- interface I/O signals 
clk : IN std_logic; -- clock input
reset : IN std_logic; -- synchronous active-high reset
lrsel : IN std_logic; -- select L/R channel for read
rd : IN std_logic; -- read from the codec ADC
ladc_out : OUT std_logic_vector(ADC_WIDTH-1 DOWNTO 0); -- L ADC
radc_out : OUT std_logic_vector(ADC_WIDTH-1 DOWNTO 0); -- R ADC
ladc_out_rdy : OUT std_logic; -- left ADC output ready to read
radc_out_rdy : OUT std_logic; -- right ADC output ready to read
adc_overrun : OUT std_logic; -- ADC overwritten before read
-- codec chip I/O signals
sclk : OUT std_logic; -- serial data clock to codec
lrck : OUT std_logic; -- left/right codec channel select
sdout : IN std_logic -- serial input from codec ADC
); 
END codec_intfc;
ARCHITECTURE codec_intfc_arch OF codec_intfc IS
CONSTANT yes : STD_LOGIC := '1';
CONSTANT no : STD_LOGIC := '0';
CONSTANT ready : STD_LOGIC := '1';
CONSTANT overrun : STD_LOGIC := '1';
CONSTANT left : STD_LOGIC := '0';








-- interface I/O signals 
clk : IN std_logic; -- clock input
reset : IN std_logic; -- synchronous active-high reset
-- codec chip clock signals
sclk : OUT std_logic; -- serial data clock to codec
lrck : OUT std_logic; -- left/right codec channel select
bit_cntr : OUT std_logic_vector(5 DOWNTO 0);




















clk : IN std_logic; -- clock input
reset : IN std_logic; -- synchronous active-high reset
chan_on : IN std_logic;
bit_cntr : IN std_logic_vector(5 DOWNTO 0);
subcycle_cntr : IN std_logic_vector(1 DOWNTO 0);
chan_sel : IN std_logic; -- select L/R channel for read/write
rd : IN std_logic; -- read from the codec ADC
adc_out : OUT std_logic_vector(ADC_WIDTH-1 DOWNTO 0); -- ADC output
adc_out_rdy : OUT std_logic; -- ADC output is ready to be read
adc_overrun : OUT std_logic; -- ADC overwritten before being read
-- codec chip I/O signals
sdout : IN std_logic -- serial input from codec ADC
); 
END COMPONENT; 
SIGNAL lrck_int : std_logic; -- internal L/R codec channel select
SIGNAL sclk_int : std_logic; -- internal codec data shift
SIGNAL bit_cntr : std_logic_vector(5 DOWNTO 0);
SIGNAL subcycle_cntr : std_logic_vector(1 DOWNTO 0);
SIGNAL ladc_overrun : std_logic;
SIGNAL radc_overrun : std_logic;
SIGNAL lchan_sel : std_logic;
SIGNAL rchan_sel : std_logic;
SIGNAL lchan_on : std_logic;












lrck => lrck_int,
bit_cntr => bit_cntr,
subcycle_cntr => subcycle_cntr
);
lrck <= not (lrck_int); -- invert for inverter in XStend V1.3
sclk <= not (sclk_int);
lchan_sel <= YES WHEN lrsel=LEFT ELSE NO;










chan_on => lchan_on,




adc_out => ladc_out,
adc_out_rdy => ladc_out_rdy,









sdout => sdout
);
rchan_sel <= YES WHEN lrsel=RIGHT ELSE NO;
rchan_on <=YES WHEN lrck_int=RIGHT ELSE NO; 
uright: channel 
GENERIC MAP 






chan_on => rchan_on,
bit_cntr => bit_cntr,
subcycle_cntr => subcycle_cntr,
chan_sel => rchan_sel,
rd => rd,
adc_out => radc_out,
adc_out_rdy => radc_out_rdy,
adc_overrun => radc_overrun,
sdout => sdout
);
adc_overrun <= YES WHEN ladc_overrun=YES OR radc_overrun=YES
ELSE NO;
END codec_intfc_arch; 
Description/ File name: TSTCODEC_INTFC.VHD 
LIBRARY ieee;
USE ieee.std_logic_1164.ALL;
USE ieee.numeric_std.ALL;
ENTITY testbench IS
END testbench;
ARCHITECTURE behavior OF testbench IS
COMPONENT codec_intfc
PORT(
-- interface I/O signals
clk : IN std_logic; -- clock input
reset : IN std_logic; -- synchronous active-high reset
lrsel : IN std_logic; -- select L/R channel for read
rd : IN std_logic; -- read from the codec ADC
ladc_out : OUT std_logic_vector(19 DOWNTO 0); -- L ADC
radc_out : OUT std_logic_vector(19 DOWNTO 0); -- R ADC
ladc_out_rdy : OUT std_logic; -- left ADC output ready to read
radc_out_rdy : OUT std_logic; -- right ADC output ready to read
adc_overrun : OUT std_logic; -- ADC overwritten before read
-- codec chip I/O signals
sclk : OUT std_logic;
lrck : OUT std_logic; -- left/right codec channel select
sdout : IN std_logic -- serial input from codec ADC
);
END COMPONENT;
SIGNAL clk : std_logic;










SIGNAL lrsel : std_logic; -- select L/R channel for read
SIGNAL rd : std_logic; -- read from the codec ADC
SIGNAL ladc_out : std_logic_vector(19 DOWNTO 0); -- L ADC
SIGNAL radc_out : std_logic_vector(19 DOWNTO 0); -- R ADC
SIGNAL ladc_out_rdy : std_logic; -- left ADC output ready to read
SIGNAL radc_out_rdy : std_logic; -- right ADC output ready to read
SIGNAL adc_overrun : std_logic; -- ADC overwritten before read
SIGNAL lrck : std_logic; -- left/right codec channel select
SIGNAL sclk : std_logic;
SIGNAL sdout : std_logic;
SIGNAL Clock_cycle : natural := 0;
BEGIN 
uut: codec_intfc PORT MAP(
clk => clk,
reset => reset,
lrsel => lrsel,
rd => rd,
ladc_out => ladc_out,
radc_out => radc_out,
ladc_out_rdy => ladc_out_rdy,
radc_out_rdy => radc_out_rdy,
adc_overrun => adc_overrun,
lrck => lrck,
sclk => sclk,
sdout => sdout
);
CLOCK: process
begin
Clock_cycle <= Clock_cycle + 1;
clk <= '1';
wait for 41.5 ns;
clk <= '0';
wait for 41.5 ns;
end process;
-- *** Test Bench - User Defined Section ***
tb : PROCESS
BEGIN
reset <= '1';
rd <= '0'; -- rd outputs the value stored in
-- SR connected to ADC




wait for 100 ns;
-- shift serial data from codec ADC






-- *** End Test Bench - User Defined Section ***
END; 
Description/ File name: CLOCK_DIVIDER.VHD











-- divide 12MHz -> 1Hz
LIBRARY IEEE;
USE IEEE.STD_LOGIC_1164.all;
USE IEEE.STD_LOGIC_ARITH.all;
USE IEEE.STD_LOGIC_UNSIGNED.all;











ENTITY divider IS
PORT(
clock_12Mhz : IN STD_LOGIC;
clock_1Mhz : OUT STD_LOGIC;
clock_100Khz : OUT STD_LOGIC;
clock_10Khz : OUT STD_LOGIC;
clock_1Khz : OUT STD_LOGIC;
clock_100hz : OUT STD_LOGIC;
clock_10hz : OUT STD_LOGIC;
clock_1Hz : OUT STD_LOGIC);
END divider;
ARCHITECTURE comp OF divider IS
SIGNAL count_1Mhz : STD_LOGIC_VECTOR(4 DOWNTO 0);
SIGNAL count_100Khz, count_10Khz, count_1Khz : STD_LOGIC_VECTOR(2 DOWNTO 0);
SIGNAL count_100hz, count_10hz, count_1hz : STD_LOGIC_VECTOR(2 DOWNTO 0);
SIGNAL clock_1Mhz_int, clock_100Khz_int, clock_10Khz_int, clock_1Khz_int : STD_LOGIC;




-- Divide by 12
WAIT UNTIL clock_12Mhz'EVENT and clock_12Mhz = '1';
IF count_1Mhz < 11 THEN
count_1Mhz <= count_1Mhz + 1;
ELSE
count_1Mhz <= "00000";
END IF;
IF count_1Mhz < 7 THEN
clock_1Mhz_int <= '0';
ELSE
clock_1Mhz_int <= '1';
END IF;
clock_1Mhz <= clock_1Mhz_int;
clock_100Khz <= clock_100Khz_int;
clock_10Khz <= clock_10Khz_int;
clock_1Khz <= clock_1Khz_int;
clock_100hz <= clock_100hz_int;
clock_10hz <= clock_10hz_int;
clock_1Hz <= onesec_int;
END PROCESS; 
-- Divide by 10 
PROCESS 
variable startup : natural; 
BEGIN 
if startup = 0 then
count_100Khz <= "000";
clock_100Khz_int <= '0';
startup := 1;
end if;
WAIT UNTIL clock_1Mhz_int'EVENT and clock_1Mhz_int = '1';
IF count_100Khz /= 4 THEN
count_100Khz <= count_100Khz + 1;
ELSE
count_100Khz <= "000";















-- Divide by 10 
PROCESS 
variable startup : natural; 
BEGIN 





WAIT UNTIL clock_100Khz_int'EVENT and clock_100Khz_int = '1';
IF count_10Khz /= 4 THEN
count_10Khz <= count_10Khz + 1;
ELSE
count_10Khz <= "000";
clock_10Khz_int <= NOT clock_10Khz_int;
END IF; 
END PROCESS; 
-- Divide by 10 
PROCESS 
variable startup : natural; 
BEGIN 
if startup = 0 then 
count_1Khz <= "000";
clock_1Khz_int <= '0';
startup := 1;
end if;
WAIT UNTIL clock_10Khz_int'EVENT and clock_10Khz_int = '1';
IF count_1Khz /= 4 THEN
count_1Khz <= count_1Khz + 1;
ELSE
count_1Khz <= "000";
clock_1Khz_int <= NOT clock_1Khz_int;
END IF;
END PROCESS;
-- Divide by 10
PROCESS 
variable startup : natural; 
BEGIN 
if startup = 0 then 
count_100hz <= "000";
clock_100hz_int <= '0';
startup := 1;
end if;
WAIT UNTIL clock_1Khz_int'EVENT and clock_1Khz_int = '1';
IF count_100hz /= 4 THEN
count_100hz <= count_100hz + 1;
ELSE
count_100hz <= "000";
clock_100hz_int <= NOT clock_100hz_int;
END IF;
END PROCESS;
-- Divide by 10
PROCESS 
variable startup : natural; 
BEGIN 





WAIT UNTIL clock_100hz_int'EVENT and clock_100hz_int = '1';
IF count_10hz /= 4 THEN











count_10hz <= "000";
clock_10hz_int <= NOT clock_10hz_int;
END IF; 
END PROCESS; 
-- Divide by 10 
PROCESS 
variable startup : natural; 
BEGIN 





WAIT UNTIL clock_10hz_int'EVENT and clock_10hz_int = '1';
IF count_1hz /= 4 THEN
count_1hz <= count_1hz + 1;
ELSE
count_1hz <= "000";




Description/ File name: TSTCLOCK_DIVIDER.VHD 
LIBRARY ieee;
USE ieee.std_logic_1164.ALL;
USE ieee.numeric_std.ALL;
ENTITY testbench IS 
END testbench; 






































STD_LOGIC;
STD_LOGIC;
STD_LOGIC;
STD_LOGIC;
STD_LOGIC;
BEGIN 
div: divider PORT MAP(
clock_12mhz => clock_12mhz,
clock_1mhz => clock_1mhz,
clock_100khz => clock_100khz,
clock_10khz => clock_10khz,
clock_1khz => clock_1khz,
clock_100hz => clock_100hz,










clock_1Hz => clock_1Hz);
CLOCK: process
begin
Clock_cycle <= Clock_cycle + 1;
clock_12mhz <= '1';
wait for 41.5 ns;
clock_12mhz <= '0';
wait for 41.5 ns; 
end process; 
-- *** Test Bench - User Defined Section ***
-- ***End Test Bench - User Defined Section *** 
END; 
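The divide-by-10 stages in CLOCK_DIVIDER.VHD all follow one pattern: count five rising edges of the faster clock, then toggle the slower one, so that one output period spans ten input periods. After the initial divide by 12 (12 MHz to 1 MHz), six such stages give 100 kHz, 10 kHz, 1 kHz, 100 Hz, 10 Hz and 1 Hz. A minimal sketch of a single stage as a stand-alone entity is given below. The entity name div10_stage, its port names, the use of numeric_std and the explicit initial values are assumptions for illustration and are not part of the original design.

```vhdl
-- Hypothetical stand-alone divide-by-10 stage (names are illustrative).
LIBRARY IEEE;
USE IEEE.STD_LOGIC_1164.ALL;
USE IEEE.NUMERIC_STD.ALL;

ENTITY div10_stage IS
  PORT(
    clk_in  : IN  STD_LOGIC;   -- faster clock, e.g. the 1 MHz internal clock
    clk_out : OUT STD_LOGIC    -- clk_in divided by 10
  );
END div10_stage;

ARCHITECTURE rtl OF div10_stage IS
  SIGNAL count   : UNSIGNED(2 DOWNTO 0) := "000";
  SIGNAL clk_int : STD_LOGIC := '0';
BEGIN
  PROCESS (clk_in)
  BEGIN
    IF rising_edge(clk_in) THEN
      IF count /= 4 THEN
        count <= count + 1;        -- count input edges 0, 1, 2, 3
      ELSE
        count   <= "000";          -- fifth edge: restart the count
        clk_int <= NOT clk_int;    -- and toggle, giving a 50% duty cycle
      END IF;
    END IF;
  END PROCESS;
  clk_out <= clk_int;
END rtl;
```

Using a sensitivity list with rising_edge instead of a startup variable and WAIT UNTIL keeps the stage in a form most synthesis tools accept directly; the signal initial values replace the startup block for simulation.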
Description/ File name: LED.VHD 
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;
entity leddcd_behavioral is
Port ( d : in std_logic_vector(3 downto 0);
s : out std_logic_vector(6 downto 0));
end leddcd_behavioral;
architecture Behavioral of leddcd_behavioral is
begin
s <= "1110111" when d="0000" else
"0010010" when d="0001" else
"1011101" when d="0010" else
"1011011" when d="0011" else
"0111010" when d="0100" else
"1101011" when d="0101" else
"1101111" when d="0110" else
"1010010" when d="0111" else
"1111111" when d="1000" else
"1111011" when d="1001" else
"1111110" when d="1010" else
"0101111" when d="1011" else
"0001101" when d="1100" else
"0011111" when d="1101" else
"1101101" when d="1110" else
"1101100";
end Behavioral; 
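The decoder maps each 4-bit value d to a 7-bit segment pattern s, so a test bench can exercise it fully by stepping d through all sixteen codes. A minimal sketch of such a sweep is shown below. It assumes the decoder entity is named leddcd_behavioral with ports d and s as the listing above appears to declare; the test bench name tb_leddcd_sweep and the 50 ns step are illustrative choices.

```vhdl
-- Hypothetical sweep test bench for the seven-segment decoder.
LIBRARY ieee;
USE ieee.std_logic_1164.ALL;
USE ieee.numeric_std.ALL;

ENTITY tb_leddcd_sweep IS
END tb_leddcd_sweep;

ARCHITECTURE behavior OF tb_leddcd_sweep IS
  COMPONENT leddcd_behavioral
    PORT(
      d : IN  std_logic_vector(3 downto 0);
      s : OUT std_logic_vector(6 downto 0)
    );
  END COMPONENT;
  SIGNAL d : std_logic_vector(3 downto 0) := "0000";
  SIGNAL s : std_logic_vector(6 downto 0);
BEGIN
  uut: leddcd_behavioral PORT MAP(d => d, s => s);

  tb : PROCESS
  BEGIN
    FOR i IN 0 TO 15 LOOP
      d <= std_logic_vector(to_unsigned(i, 4));  -- apply code i
      WAIT FOR 50 ns;                            -- hold it long enough to inspect s
    END LOOP;
    WAIT;                                        -- stop after one full sweep
  END PROCESS;
END behavior;
```

The final WAIT without a time suspends the process permanently, so the simulation shows exactly one pass through the sixteen patterns.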
Description/ File name: TSTLED.VHD
LIBRARY ieee;
USE ieee.std_logic_1164.ALL;
USE ieee.numeric_std.ALL;
ENTITY testbench IS
END testbench;
ARCHITECTURE behavior OF testbench IS
COMPONENT leddcd_behavioral
PORT(
d : IN std_logic_vector(3 downto 0);












SIGNAL d : std_logic_vector(3 downto 0);
SIGNAL s : std_logic_vector(6 downto 0);
BEGIN 




-- *** Test Bench - User Defined Section ***
tb : PROCESS
BEGIN
d <= "0000";
wait for 50 ns; -- hold this input for 50 ns
d <= "0001";
wait for 50 ns; -- hold this input for 50 ns
d <= "0010";
wait for 50 ns; -- hold this input for 50 ns
END PROCESS; 













Voice Recognition References 
References 
1) [Peter, 1990] "The VHDL Cookbook", Peter J. Ashenden, University of Adelaide,
South Australia, 1990.
2) [Douglas, 1998] "VHDL (Computer Hardware Description Language)",
Douglas L. Perry, McGraw-Hill Co, Singapore, 1998.
3) [Morris, 2000] "Logic and Computer Design Fundamentals", M. Morris
Mano and Charles R. Kime, Upper Saddle River, New Jersey, Prentice Hall, Inc,
2000.
4) [Andrew, 1987] "Robotics and Artificial Intelligence: An Introduction to
Applied Machine Intelligence", Andrew C. Staugaard Jr, Prentice Hall, Inc, 1987.
5) [Bahl, 1986] "Maximum Mutual Information Estimation of Hidden Markov
Model Parameters for Speech Recognition", Bahl, L., Brown, P., 1986.
6) [Hirschberg, 1993] "Pitch accent in context: Predicting intonational
prominence from text - Artificial Intelligence", Hirschberg, J., 1993.
7) [Ian, 1982] "Principles of Computer Speech", Ian H. Witten, London,
Academic Press Inc Ltd, 1982.
8) "Trends in Speech Recognition", Wayne A. Lea, Englewood Cliffs, New Jersey,
Prentice-Hall, 1980.
9) "Visible Speech", Ralph K. Potter, George A. Kopp, Harriet Green Kopp, New
York, Dover Publications, 1966.
10) "Language Independent and Language Adaptive Acoustic Modeling", T. Schultz
and A. Waibel, Speech Communication, Vol 35, August 2001.
11) "Pitch accent in context: Predicting intonational prominence from text-Artificial 




15) http://www.gear21.com/speech/html/inside.html
16) www.sigda.org/ Archiw8/Procecdifl Archive Compcndiurm 'omp ·11dimn~~oo t/p 




















25) www.es.isy.liu.se/courses/TSTE90/download/HW-info_TSTE90.pdf
26) www.courses.ece.uiuc.edu/ece311/lectures/lecture17.ppt
27) www.egr.msu.edu/annweb/papers/blind_separation/ieee_cas99_bsr.pdf
28) www.utdallas.edu/~loizou/loires.html
29) www.bioid.com/sdk/docs/About_Preprocessing.htm
30) www.ece.uvic.ca/499/2002a/group05/product/specs.html
