Modelling Users, Intentions, and Structure in Spoken Dialog
We outline how utterances in dialogs can be interpreted using a partial first
order logic. We exploit the capability of this logic to talk about the truth
status of formulae to define a notion of coherence between utterances and
explain how this coherence relation can serve for the construction of AND/OR
trees that represent the segmentation of the dialog. In a BDI model we
formalize basic assumptions about dialog and cooperative behaviour of
participants. These assumptions provide a basis for inferring speech acts from
coherence relations between utterances and attitudes of dialog participants.
Speech acts prove to be useful for determining dialog segments based on the
notion of completing expectations of dialog participants. Finally, we sketch
how explicit segmentation signalled by cue phrases and performatives is covered
by our dialog model.
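The AND/OR trees used above to represent dialog segmentation can be sketched as a small recursive data structure. This is a minimal illustration only: the node kinds, the traversal, and the example dialog are invented here and are not taken from the paper's formalism.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    """A node in an AND/OR dialog segmentation tree (illustrative sketch).

    AND nodes group utterances that jointly form one coherent segment;
    OR nodes would represent alternative segmentations; leaves are utterances.
    """
    kind: str                          # "AND", "OR", or "utterance"
    label: str = ""
    children: list["Node"] = field(default_factory=list)

def utterances(tree: Node) -> list[str]:
    """Collect the utterances covered by a (sub)tree, left to right."""
    if tree.kind == "utterance":
        return [tree.label]
    return [u for child in tree.children for u in utterances(child)]

# Illustrative dialog: a request segment containing a clarification subsegment.
dialog = Node("AND", "request-segment", [
    Node("utterance", "A: I need a train to Hamburg."),
    Node("AND", "clarification", [
        Node("utterance", "B: Do you mean tomorrow morning?"),
        Node("utterance", "A: Yes, before nine."),
    ]),
    Node("utterance", "B: There is one at 8:02."),
])

print(utterances(dialog))
```

A segment here is exactly the span of utterances dominated by one AND node, which is what makes the tree a segmentation of the dialog.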
Combining Expression and Content in Domains for Dialog Managers
We present work in progress on abstracting dialog managers from their domain
in order to implement a dialog manager development tool which takes (among
other data) a domain description as input and delivers a new dialog manager for
the described domain as output. We focus on two topics: first, the
construction of domain descriptions with description logics, and second, the
interpretation of utterances in a given domain.
Semantic Processing of Out-Of-Vocabulary Words in a Spoken Dialogue System
One of the most important causes of failure in spoken dialogue systems is
usually neglected: the problem of words that are not covered by the system's
vocabulary (out-of-vocabulary or OOV words). In this paper a methodology is
described for the detection, classification and processing of OOV words in an
automatic train timetable information system. The various extensions that had
to be made to the different modules of the system are reported, resulting
in the design of appropriate dialogue strategies, as are encouraging evaluation
results on the new versions of the word recogniser and the linguistic
processor.
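The first step of the pipeline described above, detecting and roughly classifying OOV words so the dialogue manager can react, can be sketched as follows. The vocabulary, the classification rule, and the example hypothesis are invented for illustration and are not the system's actual models.

```python
# Minimal sketch: flag recognizer-hypothesis words missing from the system
# vocabulary and assign a rough semantic class, so that dialogue strategies
# (e.g. a clarification question) can be triggered. All data is illustrative.

VOCABULARY = {"when", "does", "the", "train", "to", "leave", "from"}

def detect_oov(hypothesis: list[str]) -> list[str]:
    """Return the words of a hypothesis that are not in the system vocabulary."""
    return [w for w in hypothesis if w.lower() not in VOCABULARY]

def classify_oov(word: str) -> str:
    """Very rough semantic class guess for an OOV word (illustrative rule)."""
    if word[:1].isupper():
        return "proper-name"      # in a timetable domain: likely a station name
    return "unknown"

hypothesis = "when does the train to Pirmasens leave".split()
for word in detect_oov(hypothesis):
    print(word, "->", classify_oov(word))
```

In a real system the detection step would live in the word recogniser (e.g. via garbage models) rather than in a post-hoc dictionary lookup; the sketch only shows the control flow into the dialogue strategy.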
Rule based replication strategy for heterogeneous, autonomous information systems
In the rule-based replication strategy RegRess, write and read accesses to the replicas are coordinated by means of replication rules. These rules are formulated in the purpose-built rule language RRML, which can take both functional and technical requirements into account. Before every access to the replicas, an inference over these rules is carried out to determine the replicas concerned. In this way RegRess realizes a wide range of consistency behaviours; in particular, temporary inconsistencies are tolerated. A set of rules specified for a given use case forms the configuration of RegRess. Because the rules can take system states into account, the behaviour can be adapted at runtime; RegRess is thus a configurable, adaptive replication strategy. RegRess is realized by the replication manager KARMA, which contains a rule interpreter for the RRML.
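The core idea, evaluating state-dependent rules before each access to pick the participating replicas, might be sketched as below. The rule representation and predicates are invented for this sketch; they are not the actual RRML syntax or KARMA's interpreter.

```python
# Sketch of rule-driven replica selection in the spirit of RegRess: before
# each access, rules are evaluated against the current system state to decide
# which replicas take part. First matching rule wins. Not actual RRML.

def select_replicas(rules, operation, state):
    """Return the replicas an operation must touch, per the first matching rule."""
    for condition, op, replicas in rules:
        if op == operation and condition(state):
            return replicas
    return []

rules = [
    # (condition on system state, operation, replicas to involve)
    (lambda s: s["load"] < 0.8, "write", ["primary", "mirror"]),  # strict consistency
    (lambda s: True,            "write", ["primary"]),  # tolerate temporary inconsistency
    (lambda s: True,            "read",  ["mirror"]),
]

print(select_replicas(rules, "write", {"load": 0.95}))  # high load: primary only
```

Because the conditions read the live system state, changing behaviour at runtime needs no redeployment, which is the adaptivity the abstract claims for RegRess.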
Topic spotting using subword units
In this paper we present a new approach to topic spotting based on subword units and feature vectors instead of words. In our first approach, we use only vector-quantized feature vectors and polygram language models for topic representation. In the second approach, we use phonemes instead of the vector-quantized feature vectors and again model the topics using polygram language models. We trained and tested the two methods on two different corpora. The first is part of a media corpus which contains data from TV shows on three different topics. The second is the VERBMOBIL corpus, where we used 18 dialog acts as topics. Each corpus was split into disjoint test and training sets. We achieved recognition rates of up to 82% for the three topics of the media corpus and up to 64% using the 18 dialog acts of the VERBMOBIL corpus as topics.
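A minimal sketch of the second, phoneme-based approach: one n-gram language model per topic (here an add-one-smoothed bigram standing in for a polygram model) is trained on phoneme sequences, and a test sequence is assigned the topic whose model scores it highest. The phoneme sequences and topic names are invented for illustration.

```python
import math
from collections import Counter

# Sketch of subword topic spotting: per-topic bigram models over phoneme
# sequences, argmax over model log-likelihoods. Data is illustrative only.

def train_bigram(sequences):
    """Add-one-smoothed bigram scorer over symbol pairs."""
    pairs = Counter(p for seq in sequences for p in zip(seq, seq[1:]))
    total = sum(pairs.values())
    vocab = {s for seq in sequences for s in seq}
    def logprob(seq):
        return sum(math.log((pairs[p] + 1) / (total + len(vocab) ** 2))
                   for p in zip(seq, seq[1:]))
    return logprob

topics = {
    "weather": train_bigram([["s", "a", "n"], ["r", "a", "i", "n"]]),
    "travel":  train_bigram([["t", "r", "i", "p"], ["t", "r", "ai", "n"]]),
}

def spot_topic(seq):
    """Assign the topic whose language model gives the sequence the best score."""
    return max(topics, key=lambda t: topics[t](seq))

print(spot_topic(["t", "r", "i", "p"]))
```

The polygram models in the paper interpolate n-grams of varying length; a fixed-order bigram is the simplest member of that family and keeps the sketch short.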
Syntactic-prosodic labeling of large spontaneous speech data-bases
In automatic speech understanding, the division of continuously running speech into syntactic chunks is a major problem. Syntactic boundaries are often marked by prosodic means. Large databases are necessary for training statistical models of prosodic boundaries. For the German Verbmobil project (automatic speech-to-speech translation), we developed a syntactic-prosodic labeling scheme in which two main types of boundaries (major syntactic boundaries and syntactically ambiguous boundaries) and some other special boundaries are labeled for a large Verbmobil spontaneous speech corpus. We compare the results of classifiers (multilayer perceptrons and language models) trained on these syntactic-prosodic boundary labels with classifiers trained on perceptual-prosodic and purely syntactic labels. The main advantage of the rough syntactic-prosodic labels presented in this paper is that large amounts of data could be labeled within a short time. As a result, the classifiers trained with these labels turned out to be superior (recognition rates of up to 96%).
Detection of phrase boundaries and accents
We performed experiments on the recognition of phrase boundaries and phrase accents using a large speech database read by untrained speakers. We used durational features as well as features derived from pitch and energy contours and pause information, and compared different feature sets. For distinguishing three boundary classes a recognition rate of 75.7% was achieved, and for distinguishing accented from unaccented syllables a recognition rate of 88.7%.
Prosodic processing and its use in Verbmobil
We present the prosody module of the VERBMOBIL speech-to-speech translation system, the first complete system worldwide that successfully uses prosodic information in the linguistic analysis. This is achieved by computing probabilities for clause boundaries, accentuation, and different types of sentence mood for each of the word hypotheses computed by the word recognizer. These probabilities guide the search of the linguistic analysis. Disambiguation is achieved already during the analysis rather than by a prosodic verification of different linguistic hypotheses. So far, the most useful prosodic information is provided by clause boundaries, which are detected with a recognition rate of 94%. For the parsing of word hypothesis graphs, the use of clause boundary probabilities yields a speed-up of 92% and a 96% reduction in alternative readings.
Dialog act classification with the help of prosody
This paper presents automatic methods for the segmentation and classification of dialog acts (DAs). In Verbmobil it is often sufficient to recognize the sequence of DAs occurring during a dialog between the two partners. Since a turn can consist of one or more successive DAs, we classify DAs in a two-step procedure: first, each turn is segmented into units that correspond to a DA; second, the DA categories are identified. For the segmentation we use polygrams and multi-layer perceptrons with prosodic features. The classification of DAs is done with semantic classification trees and polygrams.
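The two-step procedure can be sketched as a pipeline: a segmentation step splits a turn at predicted boundary positions, and a classification step labels each resulting unit. The boundary positions and the keyword-based classifier below are invented stand-ins for the polygram/perceptron segmenter and the semantic classification trees of the paper.

```python
# Sketch of the two-step DA procedure: segment a turn at predicted
# boundaries, then classify each unit. Both models are toy stand-ins.

def segment(words, boundary_after):
    """Split a turn into DA units after the given word indices."""
    units, start = [], 0
    for i in sorted(boundary_after):
        units.append(words[start:i + 1])
        start = i + 1
    if start < len(words):
        units.append(words[start:])
    return units

def classify(unit):
    """Toy DA classifier (illustrative keyword rules, not the paper's trees)."""
    text = " ".join(unit).lower()
    if text.startswith(("yes", "okay")):
        return "ACCEPT"
    if "?" in text or text.startswith(("when", "does", "do")):
        return "QUERY"
    return "INFORM"

turn = "okay , does the connection leave before nine ?".split()
units = segment(turn, boundary_after=[1])   # predicted boundary after "okay ,"
print([(u, classify(u)) for u in units])
```

Keeping segmentation and classification separate mirrors the paper's design: the segmenter can use prosodic cues that the category classifier never sees, and vice versa.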