Parsing Speech: A Neural Approach to Integrating Lexical and Acoustic-Prosodic Information
In conversational speech, the acoustic signal provides cues that help
listeners disambiguate difficult parses. For automatically parsing spoken
utterances, we introduce a model that integrates transcribed text and
acoustic-prosodic features using a convolutional neural network over energy and
pitch trajectories coupled with an attention-based recurrent neural network
that accepts text and prosodic features. We find that different types of
acoustic-prosodic features are individually helpful, and together give
statistically significant improvements in parse and disfluency detection F1
scores over a strong text-only baseline. For this study with known sentence
boundaries, error analyses show that the main benefit of acoustic-prosodic
features is in sentences with disfluencies, attachment decisions are most
improved, and transcription errors obscure gains from prosody.
Comment: Accepted in NAACL HLT 201
SCREEN: Learning a Flat Syntactic and Semantic Spoken Language Analysis Using Artificial Neural Networks
In this paper, we describe a so-called screening approach for learning robust
processing of spontaneously spoken language. A screening approach is a flat
analysis which uses shallow sequences of category representations for analyzing
an utterance at various syntactic, semantic and dialog levels. Rather than
using a deeply structured symbolic analysis, we use a flat connectionist
analysis. This screening approach aims at supporting speech and language
processing by using (1) data-driven learning and (2) robustness of
connectionist networks. In order to test this approach, we have developed the
SCREEN system which is based on this new robust, learned and flat analysis.
In this paper, we focus on a detailed description of SCREEN's architecture,
the flat syntactic and semantic analysis, the interaction with a speech
recognizer, and a detailed evaluation analysis of the robustness under the
influence of noisy or incomplete input. The main result of this paper is that
flat representations allow more robust processing of spontaneous spoken
language than deeply structured representations. In particular, we show how the
fault-tolerance and learning capability of connectionist networks can support a
flat analysis for providing more robust spoken-language processing within an
overall hybrid symbolic/connectionist framework.
Comment: 51 pages, Postscript. To be published in Journal of Artificial
Intelligence Research 6(1), 199
Incremental Interpretation: Applications, Theory, and Relationship to Dynamic Semantics
Why should computers interpret language incrementally? In recent years
psycholinguistic evidence for incremental interpretation has become more and
more compelling, suggesting that humans perform semantic interpretation before
constituent boundaries, possibly word by word. However, possible computational
applications have received less attention. In this paper we consider various
potential applications, in particular graphical interaction and dialogue. We
then review the theoretical and computational tools available for mapping from
fragments of sentences to fully scoped semantic representations. Finally, we
tease apart the relationship between dynamic semantics and incremental
interpretation.
Comment: Procs. of COLING 94, LaTeX (2.09 preferred), 8 page
Spoken language processing in the hybrid connectionist architecture SCREEN
In this paper we describe a robust, learning approach to spoken language understanding. Since interactively spoken and computationally analyzed language often contains many errors, robust connectionist networks are used for providing a flat screening analysis. A screening analysis is a shallow flat analysis based on category sequences at various syntactic, semantic and dialog levels. Rather than using tree or graph representations, a screening analysis uses category sequences in order to support robustness and learning. This flat screening analysis is examined in the context of the system SCREEN (Symbolic Connectionist Robust EnterprisE for Natural language). Starting with the word hypotheses generated by a speech recognizer, we give an overview of the architecture, and illustrate the flat robust processing at the levels of syntax, semantics, and dialog acts. While early connectionist models were often limited to a single network and a small task, the hybrid connectionist SCREEN system is an important step towards exploring connectionist techniques in larger hybrid symbolic/connectionist environments and for real-world problems. Based on our experience with SCREEN, hybrid connectionist techniques show a lot of potential for supporting robustness in interactive spoken language processing.