37 research outputs found

    Phonetic study and text mining of Spanish for English to Spanish translation system

    Project carried out in collaboration with the University of Southern Californi

    SPRING: speech and pronunciation improvement through games, for Hispanic children

    Poor English pronunciation is a major problem for immigrant populations in developed countries such as the U.S. It poses various problems, including a barrier to entry into mainstream society. This paper presents a research study that explores the use of speech technologies combined with activity-based and arcade-based games to provide pronunciation feedback for Hispanic children in the U.S. A 3-month study with an immigrant population in California was used to investigate and analyze the effectiveness of computer-aided pronunciation feedback through games. In addition to quantitative findings that point to statistically significant gains in pronunciation quality, the paper explores qualitative findings, interaction patterns, and challenges faced by the researchers in working with this community. It also describes the issues involved in treating pronunciation as a competency. Comment: ACM ICTD 201

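    Acoustic and Language Model of Spanish for the Cucuteño Dialect, Oriented to Automatic Speech Recognition [Modelo Acústico y de Lenguaje del Idioma Español para el dialecto Cucuteño, Orientado al Reconocimiento Automático del Habla]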

    Context: Automatic speech recognition requires language and acoustic models for each existing dialect. The purpose of this research is to train an acoustic model, a statistical language model, and a grammar language model for Spanish, specifically for the dialect of the city of San José de Cúcuta, Colombia, that can be used in a command control system. The work is motivated by the shortcomings of existing Spanish-language models in handling the fundamental frequency and spectral content, accent, pronunciation, tone, or simply the language model of this region's dialect. Method: The project used a Raspberry Pi B+ embedded system running Raspbian, a Linux distribution, and two open-source packages: the CMU-Cambridge Statistical Language Modeling Toolkit from the University of Cambridge and CMU Sphinx from Carnegie Mellon University, both based on hidden Markov models for calculating voice parameters. In addition, 1913 audio recordings of speakers from San José de Cúcuta and the Norte de Santander department were used to train and test the automatic speech recognition system. Results: The resulting language model consists of two files: the statistical language model (.lm) and the JSGF grammar model (.jsgf). For the acoustic component, two models were trained; the improved version achieved a command-recognition accuracy of 100% on the training data and 83% on the audio tests. Finally, a manual was written for creating acoustic and language models with CMU Sphinx. Conclusions: The number of participants in the training of the language and acoustic models significantly influences the quality of the recognizer's voice processing. Using a large dictionary for training and a short dictionary containing only the command words for deployment is important for getting a better response from the automatic speech recognition system. Given the accuracy rate above 80% in the recognition tests, the proposed models are suitable for applications that assist people with visual or motor impairments.

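    Application of audio segmentation and automatic dialect recognition technologies to obtain information from dialogues contained in audio [Aplicación de tecnologías de segmentación de audio y reconocimiento automático de dialecto para la obtención de información de diálogos contenidos en audio]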

    The interest of the scientific community in identifying audiovisual content has grown considerably in recent years, driven by the need to run automatic classification and monitoring processes on the growing volume of content broadcast by media such as television, radio, and the internet. This article proposes an architecture for extracting information from audio, with the aim of applying it to the analysis of television content in the Ecuadorian context. Two services are defined: an audio segmentation service and a transcription service. The segmentation service identifies and extracts audio segments containing speech, music, or speech over a musical background. The transcription service then recognizes the speech segments to obtain their content as text. These services and the tools that compose them were evaluated to measure their performance and, for the tools, to determine which best fits the architecture. The evaluations of the proposed architecture show that a speech recognition system built from existing open-source tools offers a higher level of precision than a general-purpose transcription service.
    Ingeniero de Sistemas. Cuenc

    Master of Arts in Art History

    Thesis. Tania Candiani's 2012 work Bordadora invites participants to whisper secrets in one of three confessional booths and then uses voice recognition software and an embroidery machine with Computer Numerical Control programming to stitch those secrets onto a tapestry in graffiti-style lettering. The work was created for Laboratorio Arte Alameda for the solo show Cinco variaciones de circunstancias fonicas y una pausa (Five Variations on Phonic Circumstance and a Pause). The Laboratorio is a new media art space housed in a former convent that was a site for public executions during the Spanish Inquisition in Mexico from 1596 to 1771, and the use of religious iconography in Candiani's work indicates that Bordadora references the history of the building. Because of the site specificity of the work and the explicit social commentary in Candiani's oeuvre, the symbolism of the industrial embroidery machine and confessional booths in Bordadora functions as a critique of textile manufacturing an

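    Development of a platform for recording and analyzing somniloquies: an approach using voice analysis techniques [Desarrollo de una plataforma para la grabación y análisis de somniloquias : una aproximación usando técnicas de análisis de voz]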

    Somniloquy is a parasomnia that consists of talking aloud during sleep. It occurs in both the REM and NREM sleep phases. Recordings of these episodes can help in diagnosing certain psychological pathologies, since they can reflect anxiety states or behaviors that could be identified as psychological diagnostic criteria. In this project we develop a mobile platform that records only the moments when a person talks while asleep (excluding other noises or sounds) and analyzes them to identify the main emotion in the voice, in addition to transcribing the speech.
    Ingeniería en Informátic

    Semi-Supervised Named Entity Recognition: Learning to Recognize 100 Entity Types with Little Supervision

    Named Entity Recognition (NER) aims to extract and to classify rigid designators in text such as proper names, biological species, and temporal expressions. There has been growing interest in this field of research since the early 1990s. In this thesis, we document a trend moving away from handcrafted rules, and towards machine learning approaches. Still, recent machine learning approaches have a problem with annotated data availability, which is a serious shortcoming in building and maintaining large-scale NER systems. In this thesis, we present an NER system built with very little supervision. Human supervision is indeed limited to listing a few examples of each named entity (NE) type. First, we introduce a proof-of-concept semi-supervised system that can recognize four NE types. Then, we expand its capacities by improving key technologies, and we apply the system to an entire hierarchy comprised of 100 NE types. Our work makes the following contributions: the creation of a proof-of-concept semi-supervised NER system; the demonstration of an innovative noise filtering technique for generating NE lists; the validation of a strategy for learning disambiguation rules using automatically identified, unambiguous NEs; and finally, the development of an acronym detection algorithm, thus solving a rare but very difficult problem in alias resolution. We believe semi-supervised learning techniques are about to break new ground in the machine learning community. In this thesis, we show that limited supervision can build complete NER systems. On standard evaluation corpora, we report performances that compare to baseline supervised systems in the task of annotating NEs in texts.

    Wooster Magazine: Winter 2008

    The Winter 2008 issue of the Wooster magazine examines coaching insights from Wooster alumni. The landscape in front of Kauke won an award. President Grant Cornwell will be inaugurated on April 25. Alumnus Steve Glick discusses his change from history major to chief of police for the City of Wooster. Studio art major Kathryn Lehner '07 has her independent study project highlighted. The First Year Seminar, Small Towns, is featured in this issue. Thirty-three pages are filled with the Class Notes section.