5,388 research outputs found
Dialogue Act Modeling for Automatic Tagging and Recognition of Conversational Speech
We describe a statistical approach for modeling dialogue acts in
conversational speech, i.e., speech-act-like units such as Statement, Question,
Backchannel, Agreement, Disagreement, and Apology. Our model detects and
predicts dialogue acts based on lexical, collocational, and prosodic cues, as
well as on the discourse coherence of the dialogue act sequence. The dialogue
model is based on treating the discourse structure of a conversation as a
hidden Markov model and the individual dialogue acts as observations emanating
from the model states. Constraints on the likely sequence of dialogue acts are
modeled via a dialogue act n-gram. The statistical dialogue grammar is combined
with word n-grams, decision trees, and neural networks modeling the
idiosyncratic lexical and prosodic manifestations of each dialogue act. We
develop a probabilistic integration of speech recognition with dialogue
modeling, to improve both speech recognition and dialogue act classification
accuracy. Models are trained and evaluated using a large hand-labeled database
of 1,155 conversations from the Switchboard corpus of spontaneous
human-to-human telephone speech. We achieved good dialogue act labeling
accuracy (65% based on errorful, automatically recognized words and prosody,
and 71% based on word transcripts, compared to a chance baseline accuracy of
35% and human accuracy of 84%) and a small reduction in word recognition error.Comment: 35 pages, 5 figures. Changes in copy editing (note title spelling
changed
Conversational Sensing
Recent developments in sensing technologies, mobile devices and context-aware
user interfaces have made it possible to represent information fusion and
situational awareness as a conversational process among actors - human and
machine agents - at or near the tactical edges of a network. Motivated by use
cases in the domain of security, policing and emergency response, this paper
presents an approach to information collection, fusion and sense-making based
on the use of natural language (NL) and controlled natural language (CNL) to
support richer forms of human-machine interaction. The approach uses a
conversational protocol to facilitate a flow of collaborative messages from NL
to CNL and back again in support of interactions such as: turning eyewitness
reports from human observers into actionable information (from both trained and
untrained sources); fusing information from humans and physical sensors (with
associated quality metadata); and assisting human analysts to make the best use
of available sensing assets in an area of interest (governed by management and
security policies). CNL is used as a common formal knowledge representation for
both machine and human agents to support reasoning, semantic information fusion
and generation of rationale for inferences, in ways that remain transparent to
human users. Examples are provided of various alternative styles for user
feedback, including NL, CNL and graphical feedback. A pilot experiment with
human subjects shows that a prototype conversational agent is able to gather
usable CNL information from untrained human subjects
- …