3 research outputs found

    Improving bottleneck features for Vietnamese large vocabulary continuous speech recognition system using deep neural networks

    In this paper, a pre-training method based on denoising auto-encoders is investigated and shown to be a good way of initializing the bottleneck networks of a Vietnamese speech recognition system, resulting in better recognition performance than the base bottleneck features reported previously. The experiments are carried out on a dataset of speech from the Voice of Vietnam (VOV) channel. The results show that DBNF extraction for Vietnamese recognition reduces the word error rate by a relative 14% and 39% compared to the base bottleneck features and the MFCC baseline, respectively.

    Reconocimiento de patrones de habla usando MFCC y RNA

    This work presents the results of the design and development of an algorithm based on artificial intelligence for recognizing speech patterns in Spanish-language words, using Mel-Frequency Cepstral Coefficients (MFCC) to represent speech through human auditory perception. The use of MFCC made it possible to characterize voice signals while taking into account possible noise in the recording environment, which helped in obtaining common patterns among these signals when disturbances are present. As the main result, a recognition rate between 93 and 96% was achieved for the three selected vowels (/a/, /e/, /o/). For each vowel, 22 samples were used for training and another 11 for validation. The samples were obtained from 11 test subjects, all of them male.

    Improving acoustic model for English ASR System using deep neural network

    No full text