4 research outputs found

    Adaptation to new microphones using tied-mixture normalization

    In this paper, we present several approaches designed to increase the robustness of BYBLOS, the BBN continuous speech recognition system. We address the problem of increased degradation in performance when there is mismatch in the characteristics of the training and the test microphones. We introduce a new supervised adaptation algorithm that computes a transformation from the training microphone codebook to that of a new microphone, given some information about the new microphone. Results are reported for the development and evaluation test sets of the 1993 ARPA CSR Spoke 6 WSJ task, which consist of speech recorded with two alternate microphones, a stand-mount and a telephone microphone. The proposed algorithm improves the performance of the system when tested with the stand-mount microphone by reducing the difference in error rate between the high-quality training microphone and the alternate stand-mount microphone recordings by a factor of 2. Several results are presented for the telephone speech, leading to important conclusions: a) the performance on telephone speech is dramatically improved by simply retraining the system on the high-quality training data after they have been bandlimited in the telephone bandwidth; and b) additional training data recorded with the high-quality microphone give further substantial improvement in performance.

    Speech recognition system robustness to microphone variations

    Thesis (M.S.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 1995. Includes bibliographical references (p. 99-101). By Jane W. Chang.

    A System for Simultaneous Translation of Lectures and Speeches

    This thesis realizes the first automatic system for simultaneous speech-to-speech translation. The focus of this system is the automatic translation of technically oriented lectures and speeches from English to Spanish, but the different aspects described in this thesis will also be helpful for developing simultaneous translation systems for other domains or languages.

    Aportación a la extracción paramétrica en reconocimiento de voz robusto basada en la aplicación de conocimiento de fonética acústica

    This thesis is based on the following hypothesis: introducing direct knowledge from the field of acoustic phonetics into the speech recognition problem, especially in the feature extraction step, may provide a solid basis for analyzing the behavior and discrimination capabilities of such systems, and for improving their performance as well. Part of the complexity of this Ph.D. thesis comes from the different disciplines related to the speech processing area. Applying acoustic-phonetic information to speech recognition research requires a broad knowledge of both subjects. The research carried out in this work has been divided into two main parts: an analysis of current feature extraction methods, and a study of several possible ways of incorporating acoustic-phonetic knowledge into those systems. Abundant recognition rates and related quality measures are presented for a total of 50 different parameter extraction models. Details of the real-time implementation of two different parameter extraction models on a DSP platform (TMS320C31-60) are also included. Finally, a set of software tools has been developed that can serve as a basis for easily building and testing new speech recognition systems. The application of several results from this work can also be extended to other speech processing areas, such as second-language teaching, speech therapy, etc.