3 research outputs found

Improving bottleneck features for Vietnamese large vocabulary continuous speech recognition system using deep neural networks

Author: Luong Mai Chi
Nguyen Bao Quoc
Vu Thang Tat
Publication venue: 'Publishing House for Science and Technology, Vietnam Academy of Science and Technology'
Publication date: 03/01/2016

This paper investigates a pre-training method based on denoising auto-encoders and shows that it provides a good initialization for the bottleneck networks of a Vietnamese speech recognition system, yielding better recognition performance than the base bottleneck features reported previously. The experiments are carried out on a dataset of speech from the Voice of Vietnam (VOV) channel. The results show that DBNF extraction for Vietnamese recognition reduces the word error rate by a relative 14% and 39% compared to the base bottleneck features and the MFCC baseline, respectively.
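The abstract reports its gains as relative word error rate (WER) reductions, meaning the drop in WER is expressed as a fraction of the baseline WER rather than in absolute percentage points. A minimal sketch of that arithmetic follows; the function name and the WER values are hypothetical, chosen only to illustrate the calculation, and are not taken from the paper.

```python
def relative_reduction(baseline_wer: float, new_wer: float) -> float:
    """Return the relative WER reduction as a fraction of the baseline WER."""
    if baseline_wer <= 0:
        raise ValueError("baseline_wer must be positive")
    return (baseline_wer - new_wer) / baseline_wer


# Hypothetical WER values (in percent), NOT figures from the paper.
baseline = 20.0   # assumed WER of a baseline system
improved = 17.2   # assumed WER of an improved system

reduction = relative_reduction(baseline, improved)
print(f"absolute reduction: {baseline - improved:.1f} points")
print(f"relative reduction: {reduction:.0%}")
```

With these assumed numbers, a drop of 2.8 absolute points corresponds to a 14% relative reduction, which is why the two ways of reporting can look very different for the same result.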

Vietnam Academy of Science and Technology: Journals Online

Reconocimiento de patrones de habla usando MFCC y RNA (Speech pattern recognition using MFCC and ANN)

Author: Góngora Leonardo A.
Ramos Olga L.
Rojas Diego A.
Publication venue: 'Universidad Distrital Francisco Jose de Caldas'
Publication date: 20/06/2016

This work presents the results of the design and development of an algorithm based on artificial intelligence and MFCC for recognizing speech patterns. Using MFCC made it possible to characterize voice signals while taking into account noise in the recording environment, which helps in estimating common patterns among these signals when disturbances are present. As the main result of this work, a recognition rate between 93 and 96% was achieved for the selected vowels (/a/, /e/, /o/). For training, 22 samples were used, and another 11 for the validation process. The samples were obtained from 11 test subjects, all of them male.

Spanish abstract, in translation: This work presents the results of the design and development of an algorithm based on artificial intelligence for recognizing patterns of Spanish-language words, using Mel-frequency cepstral coefficients (MFCC) to represent speech through human auditory perception. Using MFCC made it possible to characterize the voice signals while taking into account possible noise in the recording environment, which helped in obtaining common patterns among these signals when they exhibit disturbances. As a result, recognition above 95% was obtained for the three chosen vowels, in this case /a/, /e/, /o/, with a group of 22 samples per vowel for training and 11 samples for validation. The samples were obtained from 11 people, all male.

Universidad Distrital de la ciudad de Bogotá: Open Journal Systems

Improving acoustic model for English ASR System using deep neural network

Author
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date

Crossref