Utilizing Statistical Dialogue Act Processing in Verbmobil
In this paper, we present a statistical approach for dialogue act processing
in the dialogue component of the speech-to-speech translation system Verbmobil.
Statistics are used in dialogue processing to predict follow-up dialogue acts.
As an application example, we show how this supports repair when unexpected
dialogue states occur.
Comment: 6 pages; compressed and uuencoded postscript file; to appear in
ACL-9
An integrated architecture for shallow and deep processing
We present an architecture for the integration of shallow and deep NLP components which is aimed at flexible combination of different language technologies for a range of practical current and future applications. In particular, we describe the integration of a high-level HPSG parsing system with different high-performance shallow components, ranging from named entity recognition to chunk parsing and shallow clause recognition. The NLP components enrich a representation of natural language text with layers of new XML meta-information using a single shared data structure, called the text chart. We describe details of the integration methods, and show how information extraction and language checking applications for real-world German text benefit from a deep grammatical analysis.
Dialogue Act Modeling for Automatic Tagging and Recognition of Conversational Speech
We describe a statistical approach for modeling dialogue acts in
conversational speech, i.e., speech-act-like units such as Statement, Question,
Backchannel, Agreement, Disagreement, and Apology. Our model detects and
predicts dialogue acts based on lexical, collocational, and prosodic cues, as
well as on the discourse coherence of the dialogue act sequence. The dialogue
model is based on treating the discourse structure of a conversation as a
hidden Markov model and the individual dialogue acts as observations emanating
from the model states. Constraints on the likely sequence of dialogue acts are
modeled via a dialogue act n-gram. The statistical dialogue grammar is combined
with word n-grams, decision trees, and neural networks modeling the
idiosyncratic lexical and prosodic manifestations of each dialogue act. We
develop a probabilistic integration of speech recognition with dialogue
modeling, to improve both speech recognition and dialogue act classification
accuracy. Models are trained and evaluated using a large hand-labeled database
of 1,155 conversations from the Switchboard corpus of spontaneous
human-to-human telephone speech. We achieved good dialogue act labeling
accuracy (65% based on errorful, automatically recognized words and prosody,
and 71% based on word transcripts, compared to a chance baseline accuracy of
35% and human accuracy of 84%) and a small reduction in word recognition error.
Comment: 35 pages, 5 figures. Changes in copy editing (note title spelling
changed
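The decoding step described in this abstract (dialogue acts as hidden states, a dialogue act n-gram as the transition model) can be illustrated with a small Viterbi search. This is only a sketch under stated assumptions, not the authors' implementation: it uses a bigram rather than a general n-gram, it assumes the per-utterance likelihoods have already been produced by some lexical or prosodic model, and the function name `viterbi_dialogue_acts` and all probabilities are made up for illustration.

```python
import math

def viterbi_dialogue_acts(acts, log_init, log_trans, log_likes):
    """Return the most probable dialogue act sequence.

    acts      -- list of dialogue act labels (the hidden states)
    log_init  -- log_init[a]: log P(first act = a)
    log_trans -- log_trans[a][b]: log P(next act = b | current act = a),
                 i.e. a dialogue act bigram (assumption: bigram only)
    log_likes -- one dict per utterance: log P(evidence | act)
    """
    best = {a: log_init[a] + log_likes[0][a] for a in acts}
    backpointers = []
    for obs in log_likes[1:]:
        prev_best, best, ptr = best, {}, {}
        for b in acts:
            # Best predecessor for act b at this utterance.
            p = max(acts, key=lambda a: prev_best[a] + log_trans[a][b])
            best[b] = prev_best[p] + log_trans[p][b] + obs[b]
            ptr[b] = p
        backpointers.append(ptr)
    # Trace back from the best final state.
    path = [max(acts, key=lambda a: best[a])]
    for ptr in reversed(backpointers):
        path.append(ptr[path[-1]])
    return path[::-1]

# Hypothetical two-act example; every number here is invented.
acts = ["Statement", "Question"]
log_init = {a: math.log(0.5) for a in acts}
log_trans = {
    "Statement": {"Statement": math.log(0.6), "Question": math.log(0.4)},
    "Question": {"Statement": math.log(0.8), "Question": math.log(0.2)},
}
log_likes = [
    {"Statement": math.log(0.3), "Question": math.log(0.7)},
    {"Statement": math.log(0.5), "Question": math.log(0.5)},  # ambiguous
]
path = viterbi_dialogue_acts(acts, log_init, log_trans, log_likes)
```

In this toy case the second utterance is ambiguous on its own evidence, so the bigram decides: a Question is more likely to be followed by a Statement.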
Prosodic modules for speech recognition and understanding in VERBMOBIL
Within VERBMOBIL, a large project on spoken language research in Germany, two modules for detecting and recognizing prosodic events have been developed. One module operates on speech signal parameters and the word hypothesis graph, whereas the other module, designed for a novel, highly interactive architecture, only uses speech signal parameters as its input. Phrase boundaries, sentence modality, and accents are detected. The recognition rates in spontaneous dialogs are up to 82.5% for accents and up to 91.7% for phrase boundaries.
Parsing of Spoken Language under Time Constraints
Spoken language applications in natural dialogue settings place serious
requirements on the choice of processing architecture. Especially under adverse
phonetic and acoustic conditions parsing procedures have to be developed which
not only analyse the incoming speech in a time-synchronous and incremental
manner, but which are able to schedule their resources according to the varying
conditions of the recognition process. Depending on the actual degree of local
ambiguity the parser has to select among the available constraints in order to
narrow down the search space with as little effort as possible.
A parsing approach based on constraint satisfaction techniques is discussed.
It provides important characteristics of the desired real-time behaviour and
attempts to mimic some of the attention focussing capabilities of the human
speech comprehension mechanism.
Comment: 19 pages, LaTe
Some experiments in speech act prediction
In this paper, we present a statistical approach for speech act prediction in the dialogue component of the speech-to-speech translation system Verbmobil. The prediction algorithm is based on work known from language modelling and uses N-gram information computed from a training corpus. We demonstrate the performance of this method with 10 experiments. These experiments vary in two dimensions, namely whether the N-gram information is updated while processing, and whether deviations from the standard dialogue structure are processed. Six of the experiments use complete dialogues, while four process only the speech acts of one dialogue partner. It is shown that the predictions are best when the update feature is used and deviations are not processed. Even the processing of incomplete dialogues then yields acceptable results. Another experiment shows that a training corpus size of about 40 dialogues is sufficient for the prediction task, and that the structure of the dialogues of the Verbmobil corpus we use differs remarkably with respect to the predictions.
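As a rough illustration of N-gram speech act prediction with an optional update step, the sketch below counts bigrams over speech act labels and returns the most frequent followers of the previous act. It is a minimal sketch, not the Verbmobil algorithm: the class name `SpeechActPredictor`, the restriction to bigrams, the `<s>` start symbol, and the speech act labels are all assumptions made for the example.

```python
from collections import Counter, defaultdict

class SpeechActPredictor:
    """Bigram predictor over speech act labels (illustrative only)."""

    START = "<s>"  # hypothetical marker for the start of a dialogue

    def __init__(self):
        # bigrams[prev][nxt] = how often nxt followed prev
        self.bigrams = defaultdict(Counter)

    def train(self, dialogues):
        """Count bigrams in a corpus given as lists of speech act labels."""
        for acts in dialogues:
            for prev, nxt in zip([self.START] + acts, acts):
                self.bigrams[prev][nxt] += 1

    def predict(self, prev, k=3):
        """Return up to k most frequent followers of prev."""
        return [act for act, _ in self.bigrams[prev].most_common(k)]

    def update(self, prev, actual):
        """Fold an observed transition back into the counts while processing."""
        self.bigrams[prev][actual] += 1

# Invented toy corpus; the labels are placeholders, not Verbmobil's tag set.
corpus = [
    ["greet", "suggest", "accept"],
    ["greet", "suggest", "reject"],
    ["greet", "suggest", "accept"],
]
predictor = SpeechActPredictor()
predictor.train(corpus)
before_update = predictor.predict("suggest", k=1)

# Observing two more rejections shifts the prediction.
predictor.update("suggest", "reject")
predictor.update("suggest", "reject")
after_update = predictor.predict("suggest", k=1)
```

Calling `update` after each observed act corresponds loosely to the "update while processing" dimension of the experiments; leaving it out gives a predictor fixed at training time.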
Semantic transfer in Verbmobil
This paper is a detailed discussion of semantic transfer in the context of the Verbmobil Machine Translation project. The use of semantic transfer as a translation mechanism is introduced and justified by comparison with alternative approaches. Some criteria for evaluation of transfer frameworks are discussed and a comparison is made of three different approaches to the representation of translation rules or equivalences. This is followed by a discussion of control of application of transfer rules and interaction with a domain description and inference component.