Search CORE

91 research outputs found

The Science and Art of Voice Interfaces

Author: Krahmer E.J.
Publication venue: Philips
Publication date: 01/01/2001
Field of study

Crowd-supervised training of spoken language systems

Author: McGraw Ian C. (Ian Carmichael)
Publication venue: Massachusetts Institute of Technology
Publication date: 01/01/2012
Field of study

Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2012.Cataloged from PDF version of thesis.Includes bibliographical references (p. 155-166).Spoken language systems are often deployed with static speech recognizers. Only rarely are parameters in the underlying language, lexical, or acoustic models updated on-the-fly. In the few instances where parameters are learned in an online fashion, developers traditionally resort to unsupervised training techniques, which are known to be inferior to their supervised counterparts. These realities make the development of spoken language interfaces a difficult and somewhat ad-hoc engineering task, since models for each new domain must be built from scratch or adapted from a previous domain. This thesis explores an alternative approach that makes use of human computation to provide crowd-supervised training for spoken language systems. We explore human-in-the-loop algorithms that leverage the collective intelligence of crowds of non-expert individuals to provide valuable training data at a very low cost for actively deployed spoken language systems. We also show that in some domains the crowd can be incentivized to provide training data for free, as a byproduct of interacting with the system itself. Through the automation of crowdsourcing tasks, we construct and demonstrate organic spoken language systems that grow and improve without the aid of an expert. Techniques that rely on collecting data remotely from non-expert users, however, are subject to the problem of noise. This noise can sometimes be heard in audio collected from poor microphones or muddled acoustic environments. Alternatively, noise can take the form of corrupt data from a worker trying to game the system - for example, a paid worker tasked with transcribing audio may leave transcripts blank in hopes of receiving a speedy payment. We develop strategies to mitigate the effects of noise in crowd-collected data and analyze their efficacy. This research spans a number of different application domains of widely-deployed spoken language interfaces, but maintains the common thread of improving the speech recognizer's underlying models with crowd-supervised training algorithms. We experiment with three central components of a speech recognizer: the language model, the lexicon, and the acoustic model. For each component, we demonstrate the utility of a crowd-supervised training framework. For the language model and lexicon, we explicitly show that this framework can be used hands-free, in two organic spoken language systems.by Ian C. McGraw.Ph.D

DSpace@MIT

Recommended from our members

Retrieving information from heterogeneous freight data sources to answer natural language queries

Author: Seedah Dan Paapanyin Kofi
Publication venue
Publication date: 09/02/2015
Field of study

textThe ability to retrieve accurate information from databases without an extensive knowledge of the contents and organization of each database is extremely beneficial to the dissemination and utilization of freight data. The challenges, however, are: 1) correctly identifying only the relevant information and keywords from questions when dealing with multiple sentence structures, and 2) automatically retrieving, preprocessing, and understanding multiple data sources to determine the best answer to user’s query. Current named entity recognition systems have the ability to identify entities but require an annotated corpus for training which in the field of transportation planning does not currently exist. A hybrid approach which combines multiple models to classify specific named entities was therefore proposed as an alternative. The retrieval and classification of freight related keywords facilitated the process of finding which databases are capable of answering a question. Values in data dictionaries can be queried by mapping keywords to data element fields in various freight databases using ontologies. A number of challenges still arise as a result of different entities sharing the same names, the same entity having multiple names, and differences in classification systems. Dealing with ambiguities is required to accurately determine which database provides the best answer from the list of applicable sources. This dissertation 1) develops an approach to identify and classifying keywords from freight related natural language queries, 2) develops a standardized knowledge representation of freight data sources using an ontology that both computer systems and domain experts can utilize to identify relevant freight data sources, and 3) provides recommendations for addressing ambiguities in freight related named entities. Finally, the use of knowledge base expert systems to intelligently sift through data sources to determine which ones provide the best answer to a user’s question is proposed.Civil, Architectural, and Environmental Engineerin

Texas ScholarWorks

Chatter--a conversational telephone agent

Author: Ly Eric Thich Vi
Publication venue: Massachusetts Institute of Technology
Publication date: 01/01/1993
Field of study

Thesis (M.S.)--Massachusetts Institute of Technology, Program in Media Arts & Sciences, 1993.Includes bibliographical references (leaves 126-130).by Eric Thich Vi Ly.M.S

DSpace@MIT

Detecting autism, emotions and social signals using AdaBoost

Author: Busa-Fekete Róbert
Gosztolya Gábor
Tóth László
Publication venue: Interspeech
Publication date: 01/01/2013
Field of study

SZTE Publicatio Repozitórium - SZTE - Repository of Publications

Introduction to scientific writing assistant (SWAN) - Tool for evaluating the quality of scientifc manuscripts

Author: Turunen Teemu P
Publication venue: University of Eastern Finland
Publication date
Field of study

UEF Electronic Publications

THaW publications

Author: Kotz David
Landwehr Carl
Publication venue: Dartmouth Digital Commons
Publication date: 11/12/2020
Field of study

In 2013, the National Science Foundation\u27s Secure and Trustworthy Cyberspace program awarded a Frontier grant to a consortium of four institutions, led by Dartmouth College, to enable trustworthy cybersystems for health and wellness. As of this writing, the Trustworthy Health and Wellness (THaW) project\u27s bibliography includes more than 130 significant publications produced with support from the THaW grant; these publications document the progress made on many fronts by the THaW research team. The collection includes dissertations, theses, journal papers, conference papers, workshop contributions and more. The bibliography is organized as a Zotero library, which provides ready access to citation materials and abstracts and associates each work with a URL where it may be found, cluster (category), several content tags, and a brief annotation summarizing the work\u27s contribution. For more information about THaW, visit thaw.org

Dartmouth Digital Commons (Dartmouth College)

Formalizing knowledge used in spectrogram reading : acoustic and perceptual evidence from stops

Author
Publication venue: Research Laboratory of Electronics, Massachusetts Institute of Technology
Publication date: 01/01/1988
Field of study

Also issued as Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 1988.Includes bibliographical references.Supported by the Defense Advanced Research Projects Agency, Vinton-Hayes, Bell Laboratories (GRPW), and Inference Corporation.Lori Faith Lamel

DSpace@MIT

Hybrid discourse modeling and summarization for a speech-to-speech translation system

Author: Alexandersson Jan
Publication venue: Fakultät 6 - Naturwissenschaftlich-Technische Fakultät I. Fachrichtung 6.2 - Informatik
Publication date: 01/01/2003
Field of study

The thesis discusses two parts of the speech-to-speech translation system VerbMobil: the dialogue model and one of its applications, multilingual summary generation. In connection with the dialogue model, two topics are of special interest: (a) the use of a default unification operation called overlay as the fundamental operation for dialogue management; and (b) an intentional model that is able to describe intentions in dialogue on five levels in a language-independent way. Besides the actual generation algorithm developed, we present a comprehensive evaluation of the summarization functionality. In addition to precision and recall, a new characterization - confabulation - is defined that provides a more precise understanding of the performance of complex natural language processing systems.Die vorliegende Arbeit behandelt hauptsächlich zwei Themen, die für das VerbMobil-System, ein Übersetzungssystem gesprochener Spontansprache, entwickelt wurden: das Dialogmodell und als Applikation die multilinguale Generierung von Ergebnissprotokollen. Für die Dialogmodellierung sind zwei Themen von besonderem Interesse. Das erste behandelt eine in der vorliegenden Arbeit formalisierte Default-Unifikations-Operation namens Overlay, die als fundamentale Operation für Diskursverarbeitung dient. Das zweite besteht aus einem intentionalen Modell, das Intentionen eines Dialogs auf fünf Ebenen in einer sprachunabhängigen Repräsentation darstellt. Neben dem für die Protokollgenerierung entwickelten Generierungsalgorithmus wird eine umfassende Evaluation zur Protokollgenerierungsfunktionalität vorgestellt. Zusätzlich zu "precision" und "recall" wird ein neues Maß - Konfabulation (Engl.: "confabulation") - vorgestellt, das eine präzisere Charakterisierung der Qualität eines komplexen Sprachverarbeitungssystems ermöglicht

Universaar

Acronym

2nd Conference on Language, Data and Knowledge (LDK 2019), May 20–23, 2019, Leipzig, Germany

Author: Buitelaar Paul
Chiarcos Christian
de Melo Gerard
Dojchinovski Milan
Eskevich Maria
Fäth Christian
Klimek Bettina
McCrae John P.
Publication venue
Publication date: 27/04/2023
Field of study

OPUS Augsburg