Patrol team language identification system for DARPA RATS P1 evaluation
This paper describes the language identification (LID) system developed by the Patrol team for the first phase of the DARPA RATS (Robust Automatic Transcription of Speech) program, which seeks to advance state-of-the-art detection capabilities on audio from highly degraded communication channels. We show that techniques originally developed for LID on telephone speech (e.g., for the NIST language recognition evaluations) remain effective on the noisy RATS data, provided that the training and development sets are designed with care. In addition, we show significant improvements from the use of Wiener filtering, neural network based and language dependent i-vector modeling, and fusion
New methods for robust speech recognition
Ankara: Department of Electrical and Electronics Engineering and the Institute of Engineering and Science of Bilkent University, 1995. Thesis (Ph.D.) -- Bilkent University, 1995. Includes bibliographical references leaves 86-92.
New methods of feature extraction, end-point detection and speech enhancement
are developed for a robust speech recognition system.
The methods of feature extraction and end-point detection are based on
wavelet analysis or subband analysis of the speech signal. Two new sets of speech
feature parameters, SUBLSFs and SUBCEPs, are introduced. Both parameter
sets are based on subband analysis. The SUBLSF feature parameters are obtained
via linear predictive analysis on subbands. These speech feature parameters
can produce better results than the full-band parameters when the noise is
colored. The SUBCEP parameters are based on wavelet analysis or equivalently
the multirate subband analysis of the speech signal. The SUBCEP parameters
also provide robust recognition performance by appropriately deemphasizing the
frequency bands corrupted by noise. It is experimentally observed that the
subband analysis based feature parameters are more robust than the commonly
used full-band analysis based parameters in the presence of car noise.
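As a rough illustration of the subband idea behind SUBCEP-style features, the following minimal sketch splits a frame with a two-channel filter bank tree, takes the log energy of each subband, and applies a cosine transform across the bands. The Haar filters, the uniform depth-3 tree, and the names `haar_split`, `subband_tree` and `subband_cepstrum` are assumptions made for this example; the abstract does not specify the filter bank or the frequency decomposition used in the thesis.

```python
import math

def haar_split(x):
    """One two-channel analysis stage: returns (lowpass, highpass) at half rate."""
    s = 1.0 / math.sqrt(2.0)
    low = [s * (x[i] + x[i + 1]) for i in range(0, len(x) - 1, 2)]
    high = [s * (x[i] - x[i + 1]) for i in range(0, len(x) - 1, 2)]
    return low, high

def subband_tree(frame, depth=3):
    """Split a frame into 2**depth subbands, kept in order of increasing frequency.

    The frame length should be a multiple of 2**depth.
    """
    bands = [list(frame)]
    for _ in range(depth):
        nxt = []
        for i, band in enumerate(bands):
            low, high = haar_split(band)
            # Downsampling a highpass branch mirrors its spectrum, so the
            # children of odd-indexed bands are swapped to keep frequency order.
            nxt.extend([high, low] if i % 2 else [low, high])
        bands = nxt
    return bands

def subband_cepstrum(frame, depth=3, n_coeffs=6, floor=1e-10):
    """Cosine transform of the log subband energies (illustrative feature vector)."""
    bands = subband_tree(frame, depth)
    log_e = [math.log(sum(v * v for v in b) / len(b) + floor) for b in bands]
    n = len(log_e)
    return [
        sum(log_e[k] * math.cos(math.pi * m * (k + 0.5) / n) for k in range(n))
        for m in range(1, n_coeffs + 1)
    ]
```

Because each subband contributes one log-energy term, a band known to be corrupted by noise could be down-weighted before the cosine transform, which is one plausible way to read "deemphasizing" in the abstract.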
α-stable random processes can be used to model the impulsive nature of public network telecommunication noise. Adaptive filtering methods are developed
for α-stable random processes. Adaptive noise cancellation techniques are used to
reduce the mismatch between training and testing conditions of the recognition
system over telephone lines.
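To make the adaptive-filtering idea concrete, here is a minimal sketch of a least mean p-norm (LMP) update, a known approach for impulsive noise that minimizes |e|^p instead of the squared error. It is shown only as an illustration: the abstract does not say which algorithms the thesis develops, and the function name `lmp_filter`, the step size `mu`, and the choice `p = 1.2` are assumptions. For α-stable noise with α < 2, moments of order α and above are infinite, so p is normally chosen below α.

```python
def lmp_filter(x, d, n_taps=2, mu=0.01, p=1.2):
    """Adapt FIR weights w so that w . [x[n], x[n-1], ...] tracks d[n].

    Returns (weights, errors). With p = 2 the update reduces to LMS
    (up to a constant factor in the step size).
    """
    w = [0.0] * n_taps
    errors = []
    for n in range(n_taps - 1, len(x)):
        u = [x[n - k] for k in range(n_taps)]          # current input vector
        e = d[n] - sum(wk * uk for wk, uk in zip(w, u))  # a priori error
        sign = 1.0 if e > 0 else -1.0 if e < 0 else 0.0
        # Stochastic gradient step on |e|**p: large (impulsive) errors are
        # compressed by the exponent p - 1 < 1, which limits their influence.
        g = (abs(e) ** (p - 1.0)) * sign
        w = [wk + mu * g * uk for wk, uk in zip(w, u)]
        errors.append(e)
    return w, errors
```

In a noise-cancellation setting, `x` would be a reference noise input and `d` the noisy speech, with the error sequence serving as the enhanced output.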
Another important problem in isolated speech recognition is to determine
the boundaries of the speech utterances or words. Precise boundary detection
of utterances improves the performance of speech recognition systems. A new
distance measure based on the subband energy levels is introduced for endpoint
detection.
Erzin, Engin
Ph.D
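The endpoint-detection idea in the abstract above can be sketched as follows: compute log subband energies per frame, estimate a background profile from the first few frames, and mark frames whose distance from that profile exceeds a threshold. The Euclidean distance, the DFT-based band energies, the threshold value, and the names `band_log_energies` and `detect_endpoints` are all assumptions for illustration; the abstract does not define the thesis's actual distance measure.

```python
import cmath
import math

def band_log_energies(frame, n_bands=4, floor=1e-10):
    """Log energy in n_bands equal-width bands of the frame's DFT power spectrum."""
    n = len(frame)
    half = n // 2
    power = []
    for k in range(half):
        acc = sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        power.append(abs(acc) ** 2 / n)
    width = half // n_bands
    return [math.log(sum(power[b * width:(b + 1) * width]) + floor)
            for b in range(n_bands)]

def detect_endpoints(signal, frame_len=64, noise_frames=4, threshold=3.0):
    """Return (start, end) sample indices of the active region, or None.

    Assumes the first noise_frames frames contain background only.
    """
    frames = [signal[i:i + frame_len]
              for i in range(0, len(signal) - frame_len + 1, frame_len)]
    feats = [band_log_energies(f) for f in frames]
    n_bands = len(feats[0])
    # Background profile: mean log energy per band over the leading frames.
    noise = [sum(f[b] for f in feats[:noise_frames]) / noise_frames
             for b in range(n_bands)]
    # Distance of each frame's subband profile from the background profile.
    dist = [math.sqrt(sum((f[b] - noise[b]) ** 2 for b in range(n_bands)))
            for f in feats]
    active = [i for i, dval in enumerate(dist) if dval > threshold]
    if not active:
        return None
    return active[0] * frame_len, (active[-1] + 1) * frame_len
```

The direct DFT is quadratic in the frame length and is used here only to keep the sketch dependency-free; a practical system would use an FFT or a filter bank.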