796 research outputs found

Parsing Speech: A Neural Approach to Integrating Lexical and Acoustic-Prosodic Information

Author: Bansal Mohit
Gimpel Kevin
Livescu Karen
Ostendorf Mari
Toshniwal Shubham
Tran Trang
Publication date: 01/01/2018

In conversational speech, the acoustic signal provides cues that help listeners disambiguate difficult parses. For automatically parsing spoken utterances, we introduce a model that integrates transcribed text and acoustic-prosodic features using a convolutional neural network over energy and pitch trajectories coupled with an attention-based recurrent neural network that accepts text and prosodic features. We find that different types of acoustic-prosodic features are individually helpful, and together give statistically significant improvements in parse and disfluency detection F1 scores over a strong text-only baseline. For this study with known sentence boundaries, error analyses show that the main benefit of acoustic-prosodic features is in sentences with disfluencies, attachment decisions are most improved, and transcription errors obscure gains from prosody.
Comment: Accepted in NAACL HLT 201

arXiv.org e-Print Archive

Crossref

SCREEN: Learning a Flat Syntactic and Semantic Spoken Language Analysis Using Artificial Neural Networks

Author: Weber Volker
Wermter Stefan
Publication date: 31/12/1996

In this paper, we describe a so-called screening approach for learning robust processing of spontaneously spoken language. A screening approach is a flat analysis which uses shallow sequences of category representations for analyzing an utterance at various syntactic, semantic and dialog levels. Rather than using a deeply structured symbolic analysis, we use a flat connectionist analysis. This screening approach aims at supporting speech and language processing by using (1) data-driven learning and (2) robustness of connectionist networks. In order to test this approach, we have developed the SCREEN system which is based on this new robust, learned and flat analysis. In this paper, we focus on a detailed description of SCREEN's architecture, the flat syntactic and semantic analysis, the interaction with a speech recognizer, and a detailed evaluation analysis of the robustness under the influence of noisy or incomplete input. The main result of this paper is that flat representations allow more robust processing of spontaneous spoken language than deeply structured representations. In particular, we show how the fault-tolerance and learning capability of connectionist networks can support a flat analysis for providing more robust spoken-language processing within an overall hybrid symbolic/connectionist framework.
Comment: 51 pages, Postscript. To be published in Journal of Artificial Intelligence Research 6(1), 199

arXiv.org e-Print Archive

CiteSeerX

Universaar

Incremental Interpretation: Applications, Theory, and Relationship to Dynamic Semantics

Author: Cooper Robin
Milward David
Publication date: 01/01/1994

Why should computers interpret language incrementally? In recent years psycholinguistic evidence for incremental interpretation has become more and more compelling, suggesting that humans perform semantic interpretation before constituent boundaries, possibly word by word. However, possible computational applications have received less attention. In this paper we consider various potential applications, in particular graphical interaction and dialogue. We then review the theoretical and computational tools available for mapping from fragments of sentences to fully scoped semantic representations. Finally, we tease apart the relationship between dynamic semantics and incremental interpretation.
Comment: Procs. of COLING 94, LaTeX (2.09 preferred), 8 page

arXiv.org e-Print Archive

CiteSeerX

Spoken language processing in the hybrid connectionist architecture SCREEN

Author: Weber Volker
Wermter Stefan
Publication venue: Other institutions. DFKI Deutsches Forschungszentrum für Künstliche Intelligenz
Publication date: 01/01/1996

In this paper we describe a robust, learning approach to spoken language understanding. Since interactively spoken and computationally analyzed language often contains many errors, robust connectionist networks are used for providing a flat screening analysis. A screening analysis is a shallow flat analysis based on category sequences at various syntactic, semantic and dialog levels. Rather than using tree or graph representations, a screening analysis uses category sequences in order to support robustness and learning. This flat screening analysis is examined in the context of the system SCREEN (Symbolic Connectionist Robust EnterprisE for Natural language). Starting with the word hypotheses generated by a speech recognizer, we give an overview of the architecture, and illustrate the flat robust processing at the levels of syntax, semantics, and dialog acts. While early connectionist models were often limited to a single network and a small task, the hybrid connectionist SCREEN system is an important step towards exploring connectionist techniques in larger hybrid symbolic/connectionist environments and for real-world problems. Based on our experience with SCREEN, hybrid connectionist techniques show a lot of potential for supporting robustness in interactive spoken language processing.

CiteSeerX

Universaar

Recent advances in Janus: a speech translation system

Author: Coccaro Noah
Eisele Andreas
Mcnair A.
Rogina Ivica
Sloboda Tilo
Waibel Alex
Woszczyna Monika
et al.
Publication date: 02/08/2007

KITopen

How to repair speech repairs in an end-to-end system

Author: Batliner Anton
Nöth Elmar
Spilker Jörg
Publication date: 29/01/2020

OPUS Augsburg