7 research outputs found

    Marathi Speech Interface System for the Activation and Controlling of Electronic Equipment

    This paper presents a framework for designing a Marathi speech recognition interface that controls electronic equipment such as a fan and an electric bulb. The system is built on the HM2007 speech recognition kit; the HM2007 is a single-chip CMOS LSI circuit with an on-chip front end and analysis. The interface is trained on a database of the 15 most commonly used words for activating and controlling the equipment. Training and testing of the Marathi speech interface are carried out in both noisy and noise-free environments. The performance of the MSAEE system is calculated on the basis of False Acceptance Rate (FAR), False Recognition Rate (FRR), Word Error Rate (WER), sensitivity, specificity and F-measure. The average accuracy is 88.77% in the noisy environment and 98.88% without noise. Sensitivity is higher than specificity, which the authors take to mean that the MSAEE system is robust and dynamic. FAR is more significant than FRR, from which the authors conclude that the MSAEE system is a strong solution for controlling electrical appliances. The HM2007 is found to be a very promising speech recognition IC with which many speech interface applications can be built for the benefit of people with limited mobility. DOI: 10.17762/ijritcc2321-8169.160411
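
    The metrics named in this abstract can all be derived from four confusion counts. The following is a minimal sketch, assuming the usual textbook definitions (FAR as the share of non-target inputs wrongly accepted, FRR as the share of target inputs wrongly rejected); the abstract does not state the formulas the authors used, and the counts in the example are made up for illustration, not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class Counts:
    """Hypothetical confusion counts for a command recognizer."""
    tp: int  # target words correctly accepted
    fp: int  # non-target inputs wrongly accepted
    tn: int  # non-target inputs correctly rejected
    fn: int  # target words wrongly rejected


def metrics(c: Counts) -> dict[str, float]:
    """Compute the standard rates; assumes no denominator is zero."""
    sensitivity = c.tp / (c.tp + c.fn)   # true positive rate
    specificity = c.tn / (c.tn + c.fp)   # true negative rate
    precision = c.tp / (c.tp + c.fp)
    f_measure = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "FAR": c.fp / (c.fp + c.tn),     # equals 1 - specificity
        "FRR": c.fn / (c.fn + c.tp),     # equals 1 - sensitivity
        "sensitivity": sensitivity,
        "specificity": specificity,
        "F-measure": f_measure,
    }


# Illustrative counts only (not results from the paper).
example = metrics(Counts(tp=90, fp=5, tn=95, fn=10))
for name, value in example.items():
    print(f"{name}: {value:.4f}")
```

    Under these definitions FAR and FRR are the complements of specificity and sensitivity, so a comparison of FAR against FRR carries the same information as a comparison of specificity against sensitivity.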

    Voice Link: a speech interface for responsive media

    Thesis (S.M.), Massachusetts Institute of Technology, School of Architecture and Planning, Program in Media Arts and Sciences, 2002. By Yi Li. Includes bibliographical references (p. 65-66). This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections.
    We developed VoiceLink, a speech interface package for responsive media applications. It contains a set of speech interface modules that can interface with various multimedia applications written in Isis, a scripting programming language created at the MIT Media Laboratory. Specifically, we designed two command-and-control voice interfaces, one for iCom, a multi-point audio/video communication system, and another for HyperSoap, a hyperlinked TV program. The iCom module enables users to control an iCom station using voice commands while the HyperSoap module allows viewers to select objects and access related information by saying objects' names. We also built a speech software library for Isis, which allows users to develop speech aware applications in the Isis programming environment. We addressed a number of problems when designing VoiceLink. In the case of the iCom module, visual information is used to seamlessly inform users of voice commands and to provide them with instant feedback and instructions, making the speech interface intuitive, flexible and easy to use for novice users. The major challenge for the HyperSoap module is the open vocabulary problem for object selection. In our design, an item list is displayed on the screen upon viewers' request to show them selectable objects. We also created an object name index to model how viewers may call objects spontaneously. Using a combination of item list and name index in the HyperSoap module produced fairly robust performance, making the speech interface a useful alternative to traditional pointing devices. The result of user evaluation is encouraging. It showed that a speech based interface for responsive media applications is not only useful but also practical.

    Arabic goal-oriented conversational agents using semantic similarity techniques

    Conversational agents (CAs) are computer programs that interact with humans in conversation. Goal-oriented conversational agents (GO-CAs) are programs that interact with humans to serve a specific domain of interest; their importance has increased recently, and they now cover fields of technology, science and marketing. Several types of CA are used in industry: some are simple with limited usage, others are sophisticated. Most CAs were built to serve English speakers and only a few for Arabic, owing to the complexity of the Arabic language and the lack of researchers working in both linguistics and computing. This thesis covers two types of GO-CA: the traditional pattern-matching goal-oriented CA (PMGO-CA) and the semantic goal-oriented CA (SGO-CA). PMGO-CA techniques are widely used in industry due to their flexibility and high performance. However, they are labour intensive, difficult to maintain or update, and need continuous housekeeping to manage users’ utterances (especially when instructions or knowledge change). In addition, they lack any machine intelligence. SGO-CA techniques utilise human-constructed knowledge bases such as WordNet to measure word and sentence similarity. Such measurement has been researched extensively for English and very little for Arabic. In this thesis, the researcher developed a new methodology for Arabic conversational agents (using both pattern-matching and semantic CAs), covering scripting, knowledge engineering, architecture, implementation and evaluation. New tools to measure word and sentence similarity were also constructed. To test the performance of these CAs, a domain representing the Iraqi passport services was built. Both CAs were evaluated and tested by domain experts using special evaluation metrics. 
The evaluation showed very promising results, and the viability of the system for real life

    Desarrollo y evaluación de diferentes metodologías para la gestión automática del diálogo (Development and evaluation of different methodologies for automatic dialogue management)

    The main objective of this thesis is the study and development of different methodologies for dialogue management in spoken dialogue systems. The main challenge addressed in the thesis is the development of purely statistical methodologies for dialogue management, based on learning a model from a corpus of labelled dialogues. In this field, different approaches are presented for performing the management, improving the statistical model, and evaluating the dialogue system. For the practical implementation of these methodologies within a specific task, it was necessary to acquire and label a corpus of dialogues. Having a large dialogue corpus available facilitated the learning and evaluation of the management model developed. A complete dialogue system was also implemented, which makes it possible to evaluate the practical operation of the management methodologies under real conditions of use. To evaluate the dialogue management techniques, different approaches are proposed: evaluation with real users; evaluation with the acquired corpus, in which training and test partitions were defined; and the use of user simulation techniques. The user simulator developed makes it possible to model the complete dialogue process statistically. In the approach presented, both obtaining the system response and generating the user turn are modelled as a classification problem: a set of variables representing the current state of the dialogue is encoded as input, and the result of the classification is the probabilities of selecting each of the responses (sequences of dialogue acts) defined for the user and the system respectively.
    Griol Barres, D. (2007). Desarrollo y evaluación de diferentes metodologías para la gestión automática del diálogo [Unpublished doctoral thesis]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/1956
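
    The idea of treating dialogue management as classification (dialogue state in, probabilities over responses out) can be illustrated with a small sketch. It assumes a simple relative-frequency estimator as a stand-in; the abstract does not say which classifier the thesis uses, and the state variables, response labels and class name `CountDialogManager` are hypothetical.

```python
from collections import Counter, defaultdict
from typing import Optional

# A dialogue state is encoded as a tuple of variable settings (hypothetical encoding).
State = tuple[str, ...]


class CountDialogManager:
    """Estimates P(response | state) by relative frequency in a labelled corpus."""

    def __init__(self) -> None:
        self._counts: dict[State, Counter] = defaultdict(Counter)

    def train(self, corpus: list[tuple[State, str]]) -> None:
        # Each corpus item pairs a dialogue state with the labelled system response.
        for state, response in corpus:
            self._counts[state][response] += 1

    def response_probabilities(self, state: State) -> dict[str, float]:
        seen = self._counts.get(state)
        if not seen:
            return {}  # state never observed in training
        total = sum(seen.values())
        return {response: n / total for response, n in seen.items()}

    def select(self, state: State) -> Optional[str]:
        # Pick the most probable response, or None for an unseen state.
        probs = self.response_probabilities(state)
        return max(probs, key=probs.get) if probs else None


# Made-up training data for a booking-style task.
corpus = [
    (("origin=unknown", "date=unknown"), "ask_origin"),
    (("origin=unknown", "date=unknown"), "ask_origin"),
    (("origin=unknown", "date=unknown"), "ask_date"),
    (("origin=known", "date=unknown"), "ask_date"),
]
manager = CountDialogManager()
manager.train(corpus)
print(manager.response_probabilities(("origin=unknown", "date=unknown")))
print(manager.select(("origin=known", "date=unknown")))
```

    A lookup table like this cannot generalise to states absent from the corpus, which is one reason a trained classifier over the encoded state variables would be preferred in practice. The same interface could serve a user simulator by training on (state, user turn) pairs instead.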

    A dynamic multi-application dialog engine for task-oriented voice user interfaces

    This thesis introduces the Dymalog framework for spoken language dialog systems, which separates the applications from the actual dialog system. It facilitates the control of a plurality of applications through a single dialog system, changeable during run time. This is achieved by application-independent knowledge processing inside the dialog system, based on a hierarchical representation of obtained information (o²I-Trees). The approach enables the realization of generic dialog functionalities. Dymalog is composed of a collection of components, each of which serves mainly a single purpose. It fosters the generation of competing hypotheses during the processing of the user input in order to derive an optimal interpretation at a certain stage in the processing. The Marvin dialog system puts Dymalog into practice. We discuss selected interactions with various applications enabled for operation through the system. The parameterized hypothesis selection process is considered in detail, especially the parameter estimation algorithm Grail, as is the development process in the generation of competing hypotheses for the user input.
    This thesis presents Dymalog, the foundations for realizing the speech-based dialog system Marvin for the interaction of a user with various applications. It allows different applications to be controlled by a single system and enables, among other things, dynamic changes to the available applications at run time. This is achieved by application-independent knowledge processing, based on modular ontological descriptions of the applications' degrees of freedom (o²I-Trees). The separation of dialog system and applications enables the realization of generic dialog functionalities. Dymalog consists of a series of separate units, each of which essentially contains one model for processing the user input. To obtain the optimal interpretation of the user input, the generation of alternative interpretations is encouraged. The Marvin dialog system implements the concepts of Dymalog. Selected interactions with various applications are discussed. Furthermore, the parameterized selection process for the 'best' interpretation is examined, in particular the parameter estimation algorithm Grail, and the generation of alternative hypotheses by selected units is discussed.

    Towards A Universal Speech Interface

    We discuss our ongoing attempt to design and evaluate universal human-machine speech-based interfaces. We describe one such initial design suitable for database retrieval applications, and discuss its implementation in a movie information application prototype. Initial user studies provided encouraging results regarding the usability of the design and suggested some questions for further investigation.

    ABSTRACT TOWARDS A UNIVERSAL SPEECH INTERFACE

    We discuss our ongoing attempt to design and evaluate universal human-machine speech-based interfaces. We describe one such initial design suitable for database retrieval applications, and discuss its implementation in a movie information application prototype. Initial user studies provided encouraging results regarding the usability of the design and suggested some questions for further investigation.