Learning Fault-tolerant Speech Parsing with SCREEN
This paper describes a new approach and a system SCREEN for fault-tolerant
speech parsing. SCREEN stands for Symbolic Connectionist Robust EnterprisE for
Natural language. Speech parsing describes the syntactic and semantic analysis
of spontaneous spoken language. The general approach is based on incremental
immediate flat analysis, learning of syntactic and semantic speech parsing,
parallel integration of current hypotheses, and the consideration of various
forms of speech related errors. The goal for this approach is to explore the
parallel interactions between various knowledge sources for learning
incremental fault-tolerant speech parsing. This approach is examined in a
system SCREEN using various hybrid connectionist techniques. Hybrid
connectionist techniques are examined because of their promising properties of
inherent fault tolerance, learning, gradedness and parallel constraint
integration. The input for SCREEN is hypotheses about recognized words of a
spoken utterance potentially analyzed by a speech system; the output is
hypotheses about the flat syntactic and semantic analysis of the utterance. In
this paper we focus on the general approach, the overall architecture, and
examples for learning flat syntactic speech parsing. Unlike most other
speech language architectures, SCREEN emphasizes an interactive rather than an
autonomous position, learning rather than encoding, flat analysis rather than
in-depth analysis, and fault-tolerant processing of phonetic, syntactic and
semantic knowledge.
Comment: 6 pages, postscript, compressed, uuencoded; to appear in Proceedings
of AAAI 9
Laying the Foundation for In-car Alcohol Detection by Speech
The fact that an increasing number of functions in the automobile are and will be controlled by the driver's speech raises the question of whether this speech input may be used to detect possible alcohol intoxication of the driver. To that end, a large part of the new Alcohol Language Corpus (ALC) edited by the Bavarian Archive of Speech Signals (BAS) will be used for a broad statistical investigation of possible feature candidates for classification. In this contribution we present the motivation and the design of the ALC corpus as well as first results from fundamental frequency and rhythm analysis. Our analysis, which compares sober and intoxicated speech of the same individuals, suggests that there are in fact promising features that can be derived automatically from the speech signal during the speech recognition process and that will indicate intoxication for most speakers.
Articulatory features for speech-driven head motion synthesis
This study investigates the use of articulatory features for speech-driven head motion synthesis, as opposed to the prosodic features such as F0 and energy that have mainly been used in the literature. In the proposed approach, multi-stream HMMs are trained jointly on the synchronous streams of speech and head motion data. Articulatory features can be regarded as an intermediate parametrisation of speech that is expected to have a close link with head movement. Head and articulatory movements measured by EMA were recorded synchronously with speech. The measured articulatory data were compared with those predicted from speech using an HMM-based inversion mapping system trained in a semi-supervised fashion. Canonical correlation analysis (CCA) on a data set of free speech from 12 people shows that the articulatory features are more correlated with head rotation than prosodic and/or cepstral speech features. It is also shown that head motion synthesised using articulatory features gave higher correlations with the original head motion than head motion synthesised using only prosodic features. Index Terms: head motion synthesis, articulatory features, canonical correlation analysis, acoustic-to-articulatory mappin
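The canonical correlation analysis mentioned in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `canonical_correlations`, the regularisation constant, and the synthetic "articulatory", "prosodic" and "head rotation" arrays are all assumptions made for illustration, and no data from the study is used.

```python
import numpy as np

def canonical_correlations(X, Y, reg=1e-8):
    """Return the canonical correlations between feature sets X and Y.

    X: (n_frames, p) array, Y: (n_frames, q) array (hypothetical inputs).
    reg is a small ridge term (an assumption) that keeps the covariance
    matrices positive definite.
    """
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)
    n = X.shape[0]
    Sxx = X.T @ X / (n - 1) + reg * np.eye(X.shape[1])
    Syy = Y.T @ Y / (n - 1) + reg * np.eye(Y.shape[1])
    Sxy = X.T @ Y / (n - 1)
    # Whiten both sides with Cholesky factors: M = Lx^{-1} Sxy Ly^{-T}
    Lx = np.linalg.cholesky(Sxx)
    Ly = np.linalg.cholesky(Syy)
    M = np.linalg.solve(Lx, Sxy)
    M = np.linalg.solve(Ly, M.T).T
    # The singular values of M are the canonical correlations.
    s = np.linalg.svd(M, compute_uv=False)
    return np.clip(s, 0.0, 1.0)

# Synthetic stand-ins only: head rotation depends linearly on the
# "articulatory" stream and is independent of the "prosodic" stream.
rng = np.random.default_rng(0)
articulatory = rng.normal(size=(2000, 3))
prosodic = rng.normal(size=(2000, 3))
head_rotation = articulatory @ rng.normal(size=(3, 3)) + 0.1 * rng.normal(size=(2000, 3))

print("articulatory vs head:", canonical_correlations(articulatory, head_rotation))
print("prosodic vs head:    ", canonical_correlations(prosodic, head_rotation))
```

In this toy setup the first feature set yields canonical correlations close to 1 and the second yields values close to 0, which mirrors the kind of comparison the abstract reports without reproducing its numbers.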