6,508 research outputs found
From Monologue to Dialogue: Natural Language Generation in OVIS
This paper describes how a language generation system that was originally designed for monologue generation, has been adapted for use in the OVIS spoken dialogue system. To meet the requirement that in a dialogue, the system's utterances should make up a single, coherent dialogue turn, several modifications had to be made to the system. The paper also discusses the influence of dialogue context on information status, and its consequences for the generation of referring expressions and accentuation
Reasoning About Pragmatics with Neural Listeners and Speakers
We present a model for pragmatically describing scenes, in which contrastive
behavior results from a combination of inference-driven pragmatics and learned
semantics. Like previous learned approaches to language generation, our model
uses a simple feature-driven architecture (here a pair of neural "listener" and
"speaker" models) to ground language in the world. Like inference-driven
approaches to pragmatics, our model actively reasons about listener behavior
when selecting utterances. For training, our approach requires only ordinary
captions, annotated _without_ demonstration of the pragmatic behavior the model
ultimately exhibits. In human evaluations on a referring expression game, our
approach succeeds 81% of the time, compared to a 69% success rate using
existing techniques
Augmenting Situated Spoken Language Interaction with Listener Gaze
Collaborative task solving in a shared environment requires referential success. Human speakers follow the listener’s behavior in order to monitor language comprehension (Clark, 1996). Furthermore, a natural language generation (NLG) system can exploit listener gaze to realize an effective interaction strategy by responding to it with verbal feedback in virtual environments (Garoufi, Staudte, Koller, & Crocker, 2016). We augment situated spoken language interaction with listener gaze and investigate its role in human-human and human-machine interactions. Firstly, we evaluate its impact on prediction of reference resolution using a mulitimodal corpus collection from virtual environments. Secondly, we explore if and how a human speaker uses listener gaze in an indoor guidance task, while spontaneously referring to real-world objects in a real environment. Thirdly, we consider an object identification task for assembly under system instruction. We developed a multimodal interactive system and two NLG systems that integrate listener gaze in the generation mechanisms. The NLG system “Feedback” reacts to gaze with verbal feedback, either underspecified or contrastive. The NLG system “Installments” uses gaze to incrementally refer to an object in the form of installments. Our results showed that gaze features improved the accuracy of automatic prediction of reference resolution. Further, we found that human speakers are very good at producing referring expressions, and showing listener gaze did not improve performance, but elicited more negative feedback. In contrast, we showed that an NLG system that exploits listener gaze benefits the listener’s understanding. Specifically, combining a short, ambiguous instruction with con- trastive feedback resulted in faster interactions compared to underspecified feedback, and even outperformed following long, unambiguous instructions. Moreover, alternating the underspecified and contrastive responses in an interleaved manner led to better engagement with the system and an effcient information uptake, and resulted in equally good performance. Somewhat surprisingly, when gaze was incorporated more indirectly in the generation procedure and used to trigger installments, the non-interactive approach that outputs an instruction all at once was more effective. However, if the spatial expression was mentioned first, referring in gaze-driven installments was as efficient as following an exhaustive instruction. In sum, we provide a proof of concept that listener gaze can effectively be used in situated human-machine interaction. An assistance system using gaze cues is more attentive and adapts to listener behavior to ensure communicative success
Incremental generation of plural descriptions : similarity and partitioning
Approaches to plural reference generation
emphasise descriptive brevity, but often lack
empirical backing. This paper describes
a corpus-based study of plural descriptions,
and proposes a psycholinguisticallymotivated
algorithm for plural reference
generation. The descriptive strategy is based
on partitioning and incorporates corpusderived
heuristics. An exhaustive evaluation
shows that the output closely matches human
data.peer-reviewe
- …