Speech spectrum non-stationarity detection based on line spectrum frequencies and related applications
Ankara: Department of Electrical and Electronics Engineering and The Institute of Engineering and Sciences of Bilkent University, 1998. Thesis (Master's) -- Bilkent University, 1998. Includes bibliographical references (leaves 124-132). In this thesis, two new speech variation measures for speech spectrum non-stationarity
detection are proposed. These measures are based on the Line
Spectrum Frequencies (LSF) and the spectral values at the LSF locations.
They are formulated to be subjectively meaningful, mathematically tractable,
and to have low computational complexity. To demonstrate
the usefulness of the non-stationarity detector, two applications are presented:
The first application is an implicit speech segmentation system which detects
non-stationary regions in the speech signal and obtains the boundaries of the speech
segments. The other application is a Variable Bit-Rate Mixed Excitation Linear
Predictive (VBR-MELP) vocoder utilizing a novel voice activity detector
to detect silent regions in the speech. This voice activity detector is designed
to be robust to non-stationary background noise and provides efficient coding
of silent sections and unvoiced utterances to decrease the bit-rate. Simulation
results are also presented. Ertan, Ali Erdem. M.S.
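The abstract does not give the formulas of the two proposed measures, so the following is only a minimal sketch of the underlying building block: converting linear prediction coefficients to Line Spectrum Frequencies with the standard sum and difference polynomials. The function names `lpc_to_lsf` and `lsf_change` are hypothetical, and the squared LSF distance between frames is a simple stand-in for illustration, not the measure proposed in the thesis.

```python
import numpy as np

def lpc_to_lsf(a):
    """Convert LPC coefficients [1, a1, ..., ap] of A(z) to LSFs in radians.

    Uses P(z) = A(z) + z^-(p+1) A(1/z) and Q(z) = A(z) - z^-(p+1) A(1/z),
    whose roots lie on the unit circle when A(z) is minimum phase.
    """
    a = np.asarray(a, dtype=float)
    a_ext = np.concatenate([a, [0.0]])   # pad to order p + 1
    p_poly = a_ext + a_ext[::-1]         # symmetric (sum) polynomial
    q_poly = a_ext - a_ext[::-1]         # antisymmetric (difference) polynomial
    angles = np.concatenate([
        np.angle(np.roots(p_poly)),
        np.angle(np.roots(q_poly)),
    ])
    # Keep one root of each conjugate pair and drop the trivial roots at z = +1 and z = -1.
    eps = 1e-6
    return np.sort(angles[(angles > eps) & (angles < np.pi - eps)])

def lsf_change(lsf_prev, lsf_cur):
    """Illustrative frame-to-frame change: squared Euclidean distance between LSF vectors.

    This is an assumed stand-in, not the thesis's variation measure.
    """
    diff = np.asarray(lsf_cur) - np.asarray(lsf_prev)
    return float(np.sum(diff ** 2))

if __name__ == "__main__":
    frame_a = lpc_to_lsf([1.0, -0.9, 0.5])   # toy second-order predictors
    frame_b = lpc_to_lsf([1.0, -0.2, 0.3])
    print(frame_a, frame_b, lsf_change(frame_a, frame_b))
```

A detector in this spirit would compute such a change value for each pair of adjacent frames and flag frames where it exceeds a threshold; how that threshold is chosen is not stated in the abstract.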
Comparison of CELP speech coder with a wavelet method
This thesis compares the speech quality of the Code Excited Linear Predictor (CELP, Federal Standard 1016) speech coder with that of a new wavelet method for compressing speech. The two are compared through subjective listening tests. The test signals are clean speech (i.e., with no background noise), speech with room noise, and speech with artificial noise added. Results indicate that for clean signals and signals with predominantly voiced components the CELP standard performs better than the wavelet method, but for signals with room noise the wavelet method performs much better than CELP. For signals with artificial noise added, the results depend on the noise level: CELP performs better at low noise levels and the wavelet method performs better at higher noise levels.
Residual-excited linear predictive (RELP) vocoder system with TMS320C6711 DSK and vowel characterization
The area of speech recognition by machine is one of the most popular and complicated subjects in the current multimedia field. Linear predictive coding (LPC) is a useful technique for voice coding in speech analysis and synthesis. The first objective of this research was to establish a prototype of the residual-excited linear predictive (RELP) vocoder system in a real-time environment. Although its transmission rate is higher, the quality of synthesized speech of the RELP vocoder is superior to that of other vocoders. As well, it is rather simple and robust to implement. The RELP vocoder uses residual signals as excitation rather than periodic pulse or white noise. The RELP vocoder was implemented with Texas Instruments TMS320C6711 DSP starter kit (DSK) using C.
Identifying vowel sounds is an important element in recognizing speech contents. The second objective of the research was to explore a method of characterizing vowels by means of parameters extracted by the RELP vocoder, which was not known to have been used previously in speech recognition. Five English vowels were chosen for the experimental sample. Utterances of individual vowel sounds and of the vowel sounds in one-syllable words were recorded and saved as WAVE files. A large sample of 20-ms vowel segments was obtained from these utterances. The presented method utilized 20 samples of a segment's frequency response, taken at equal intervals on a logarithmic scale, as an LPC frequency response vector. The average of each vowel's vectors was calculated. The Euclidean distances between the average vectors of the five vowels and an unknown vector were compared to classify the unknown vector into a certain vowel group.
The results indicate that, when a vowel is uttered alone, the distance to its average vector is smaller than to the other vowels' average vectors. By examining a given vowel frequency response against all known vowels' average vectors, individually, one can determine to which vowel group the given vowel belongs. When a vowel is uttered with consonants, however, variances and covariances increase. In some cases, distinct differences may not be recognized among the distances to a vowel's own average vector and the distances to the other vowels' average vectors. Overall, the results of vowel characterization did indicate an ability of the RELP vocoder to identify and classify single vowel sounds
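The classification step described above amounts to a nearest-mean rule: average the vectors of each vowel, then assign an unknown vector to the vowel whose average is closest in Euclidean distance. The sketch below illustrates that rule only. The function names are hypothetical, and the three-element vectors are made-up toy values standing in for the 20-sample LPC frequency response vectors; they are not data from the thesis.

```python
import math

def mean_vector(vectors):
    """Element-wise average of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]

def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def classify(unknown, vowel_means):
    """Return the vowel label whose average vector is closest to `unknown`."""
    return min(vowel_means, key=lambda vowel: euclidean(unknown, vowel_means[vowel]))

if __name__ == "__main__":
    # Toy training vectors (assumed values, for illustration only).
    training = {
        "a": [[1.0, 0.0, 0.0], [0.8, 0.2, 0.0]],
        "i": [[0.0, 1.0, 0.0], [0.0, 0.8, 0.2]],
    }
    vowel_means = {vowel: mean_vector(vecs) for vowel, vecs in training.items()}
    print(classify([0.7, 0.3, 0.1], vowel_means))
```

When vowels are uttered with consonants, the larger variances reported in the abstract would make the distances to the different averages closer to one another, which is where a rule of this kind becomes less reliable.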
Speech coding at medium bit rates using analysis by synthesis techniques
Sine-wave amplitude coding using wavelet basis functions
Thesis (M.S.) -- Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 1996. Includes bibliographical references (leaves 106-109). By Pankaj Oberoi. M.S.
New methods for robust speech recognition
Ankara: Department of Electrical and Electronics Engineering and the Institute of Engineering and Science of Bilkent University, 1995. Thesis (Ph.D.) -- Bilkent University, 1995. Includes bibliographical references (leaves 86-92). New methods of feature extraction, end-point detection and speech enhancement
are developed for a robust speech recognition system.
The methods of feature extraction and end-point detection are based on
wavelet analysis or subband analysis of the speech signal. Two new sets of speech
feature parameters, SUBLSF’s and SUBCEP’s, are introduced. Both parameter
sets are based on subband analysis. The SUBLSF feature parameters are obtained
via linear predictive analysis on subbands. These speech feature parameters
can produce better results than the full-band parameters when the noise is
colored. The SUBCEP parameters are based on wavelet analysis or equivalently
the multirate subband analysis of the speech signal. The SUBCEP parameters
also provide robust recognition performance by appropriately deemphasizing the
frequency bands corrupted by noise. It is experimentally observed that the
subband analysis based feature parameters are more robust than the commonly
used full-band analysis based parameters in the presence of car noise.
The α-stable random processes can be used to model the impulsive nature of the public network telecommunication noise. Adaptive filtering methods are developed
for α-stable random processes. Adaptive noise cancelation techniques are used to
reduce the mismatch between training and testing conditions of the recognition
system over telephone lines.
Another important problem in isolated speech recognition is to determine
the boundaries of the speech utterances or words. Precise boundary detection
of utterances improves the performance of speech recognition systems. A new
distance measure based on the subband energy levels is introduced for endpoint
detection. Erzin, Engin. Ph.D.
The mobile satellite service (MSS) systems for global personal communications
Worldwide interest has arisen in personal communications via satellite systems. The recently proposed mobile satellite service (MSS) systems are categorized into four areas: geostationary earth orbit (GEO) systems, medium earth orbit (MEO) systems, low earth orbit (LEO) systems, and highly elliptical orbit (HEO) systems. Most of the systems in each category are introduced and explained, including some technical details. The communication links and orbital constellations of some systems are analyzed and compared across categories and across systems. Some economic aspects of the systems are mentioned. The regulatory issues concerning frequency spectrum allocation and the current technical trends in these systems are summarized.
Proceedings of the Mobile Satellite System Architectures and Multiple Access Techniques Workshop
The Mobile Satellite System Architectures and Multiple Access Techniques Workshop served as a forum for the debate of system and network architecture issues. Particular emphasis was on those issues relating to the choice of multiple access technique(s) for the Mobile Satellite Service (MSS). These proceedings contain articles that expand upon the 12 presentations given in the workshop. Contrasting views on Frequency Division Multiple Access (FDMA), Code Division Multiple Access (CDMA), and Time Division Multiple Access (TDMA)-based architectures are presented, and system issues relating to signaling, spacecraft design, and network management constraints are addressed. An overview article that summarizes the issues raised in the numerous discussion periods of the workshop is also included
Proceedings of the Second International Mobile Satellite Conference (IMSC 1990)
Presented here are the proceedings of the Second International Mobile Satellite Conference (IMSC), held June 17-20, 1990 in Ottawa, Canada. Topics covered include future mobile satellite communications concepts, aeronautical applications, modulation and coding, propagation and experimental systems, mobile terminal equipment, network architecture and control, regulatory and policy considerations, vehicle antennas, and speech compression