14,242 research outputs found
Learning to automatically detect features for mobile robots using second-order Hidden Markov Models
In this paper, we propose a new method based on Hidden Markov Models to
interpret temporal sequences of sensor data from mobile robots to automatically
detect features. Hidden Markov Models have been used for a long time in pattern
recognition, especially in speech recognition. Their main advantages over other
methods (such as neural networks) are their ability to model noisy temporal
signals of variable length. We show in this paper that this approach is well
suited for interpretation of temporal sequences of mobile-robot sensor data. We
present two distinct experiments and results: the first one in an indoor
environment where a mobile robot learns to detect features like open doors or
T-intersections, the second one in an outdoor environment where a different
mobile robot has to identify situations like climbing a hill or crossing a
rock.Comment: 200
Combination of Standard and Complementary Models for Audio-Visual Speech Recognition
In this work, new multi-classifier schemes for isolated word speech recognition based on the combination of standard Hidden Markov Models (HMMs) and Complementary Gaussian Mixture Models (CGMMs) are proposed. Typically, in speech recognition systems, each word or phoneme in the vocabulary is represented by a model trained with samples of each particular class. The recognition is then performed by computing which model best represents the input word/phoneme to be classified. In this paper, a novel classification strategy based on complementary class models is presented. A complementary model to a particular class j refers to a model that is trained with instances of all the considered classes, excepting the ones associated to that class j. The classification schemes proposed in this paper are evaluated over two audio-visual speech databases, considering acoustic noisy conditions. Experimental results show that improvements in the recognition rates through a wide range of signal to noise ratios (SNRs) are achieved with the proposed classification methodologies.Sociedad Argentina de Informática e Investigación Operativa (SADIO
Combination of Standard and Complementary Models for Audio-Visual Speech Recognition
In this work, new multi-classifier schemes for isolated word speech recognition based on the combination of standard Hidden Markov Models (HMMs) and Complementary Gaussian Mixture Models (CGMMs) are proposed. Typically, in speech recognition systems, each word or phoneme in the vocabulary is represented by a model trained with samples of each particular class. The recognition is then performed by computing which model best represents the input word/phoneme to be classified. In this paper, a novel classification strategy based on complementary class models is presented. A complementary model to a particular class j refers to a model that is trained with instances of all the considered classes, excepting the ones associated to that class j. The classification schemes proposed in this paper are evaluated over two audio-visual speech databases, considering acoustic noisy conditions. Experimental results show that improvements in the recognition rates through a wide range of signal to noise ratios (SNRs) are achieved with the proposed classification methodologies.Sociedad Argentina de Informática e Investigación Operativa (SADIO
Speaker Identification and Spoken word Recognition in Noisy Environment using Different Techniques
In this work, an attempt is made to design ASR systems through software/computer programs which would perform Speaker Identification, Spoken word recognition and combination of both speaker identification and Spoken word recognition in general noisy environment. Automatic Speech Recognition system is designed for Limited vocabulary of Telugu language words/control commands. The experiments are conducted to find the better combination of feature extraction technique and classifier model that will perform well in general noisy environment (Home/Office environment where noise is around 15-35 dB). A recently proposed features extraction technique Gammatone frequency coefficients which is reported as the best fit to the human auditory system is chosen for the experiments along with the more common feature extraction techniques MFCC and PLP as part of Front end process (i.e. speech features extraction). Two different Artificial Neural Network classifiers Learning Vector Quantization (LVQ) neural networks and Radial Basis Function (RBF) neural networks along with Hidden Markov Models (HMMs) are chosen for the experiments as part of Back end process (i.e. training/modeling the ASRs). The performance of different ASR systems that are designed by utilizing the 9 different combinations (3 feature extraction techniques and 3 classifier models) are analyzed in terms of spoken word recognition and speaker identification accuracy success rate, design time of ASRs, and recognition / identification response time .The testing speech samples are recorded in general noisy conditions i.e.in the existence of air conditioning noise, fan noise, computer key board noise and far away cross talk noise. ASR systems designed and analyzed programmatically in MATLAB 2013(a) Environment
Learning to automatically detect features for mobile robots using second-order Hidden Markov Models
In this paper, we propose a new method based on Hidden Markov Models to interpret temporal sequences of sensor data from mobile robots to automaticall- y detect features. Hidden Markov Models have been used for a long time in pattern recognition, especially in speech recognition. Their main advantages over other methods (such as neural networks) are their ability to model noisy temporal signals of variable length. We show in this paper that this approach is well suited for interpretation of temporal sequences of mobile-robot sensor data. We present two distinct experiments and results: the first one in an indoor environment where a mobile robot learns to detect features like open doors or T-intersections, the second one in an outdoor environment where a different mobile robot has to identify situations like climbing a hill or crossing a rock
Robust audiovisual speech recognition using noise-adaptive linear discriminant analysis
© 2016 IEEE.Automatic speech recognition (ASR) has become a widespread and convenient mode of human-machine interaction, but it is still not sufficiently reliable when used under highly noisy or reverberant conditions. One option for achieving far greater robustness is to include another modality that is unaffected by acoustic noise, such as video information. Currently the most successful approaches for such audiovisual ASR systems, coupled hidden Markov models (HMMs) and turbo decoding, both allow for slight asynchrony between audio and video features, and significantly improve recognition rates in this way. However, both typically still neglect residual errors in the estimation of audio features, so-called observation uncertainties. This paper compares two strategies for adding these observation uncertainties into the decoder, and shows that significant recognition rate improvements are achievable for both coupled HMMs and turbo decoding
- …