7,692 research outputs found
Modality Choice for Generation of Referring Acts: Pointing versus Describing
The main aim of this paper is to challenge two commonly held assumptions regarding modality selection in the generation of referring acts: the assumption that non-verbal means of referring are secondary to verbal ones, and the assumption that there is a single strategy that speakers follow for generating referring acts. Our evidence is drawn from a corpus of task-oriented dialogues that was obtained through an observational study. We propose two alternative strategies for modality selection based on correlation data from the observational study. Speakers that follow the first strategy simply abstain from pointing. Speakers that follow the other strategy make the decision whether to point dependent on whether the intended referent is in focus and/or important. This decision precedes the selection of verbal means (i.e., words) for referring
Evaluating the Representational Hub of Language and Vision Models
The multimodal models used in the emerging field at the intersection of
computational linguistics and computer vision implement the bottom-up
processing of the `Hub and Spoke' architecture proposed in cognitive science to
represent how the brain processes and combines multi-sensory inputs. In
particular, the Hub is implemented as a neural network encoder. We investigate
the effect on this encoder of various vision-and-language tasks proposed in the
literature: visual question answering, visual reference resolution, and
visually grounded dialogue. To measure the quality of the representations
learned by the encoder, we use two kinds of analyses. First, we evaluate the
encoder pre-trained on the different vision-and-language tasks on an existing
diagnostic task designed to assess multimodal semantic understanding. Second,
we carry out a battery of analyses aimed at studying how the encoder merges and
exploits the two modalities.Comment: Accepted to IWCS 201
A Review of Verbal and Non-Verbal Human-Robot Interactive Communication
In this paper, an overview of human-robot interactive communication is
presented, covering verbal as well as non-verbal aspects of human-robot
interaction. Following a historical introduction, and motivation towards fluid
human-robot communication, ten desiderata are proposed, which provide an
organizational axis both of recent as well as of future research on human-robot
communication. Then, the ten desiderata are examined in detail, culminating to
a unifying discussion, and a forward-looking conclusion
Meetings and Meeting Modeling in Smart Environments
In this paper we survey our research on smart meeting rooms and its relevance for augmented reality meeting support and virtual reality generation of meetings in real time or off-line. The research reported here forms part of the European 5th and 6th framework programme projects multi-modal meeting manager (M4) and augmented multi-party interaction (AMI). Both projects aim at building a smart meeting environment that is able to collect multimodal captures of the activities and discussions in a meeting room, with the aim to use this information as input to tools that allow real-time support, browsing, retrieval and summarization of meetings. Our aim is to research (semantic) representations of what takes place during meetings in order to allow generation, e.g. in virtual reality, of meeting activities (discussions, presentations, voting, etc.). Being able to do so also allows us to look at tools that provide support during a meeting and at tools that allow those not able to be physically present during a meeting to take part in a virtual way. This may lead to situations where the differences between real meeting participants, human-controlled virtual participants and (semi-) autonomous virtual participants disappear
Individual and Domain Adaptation in Sentence Planning for Dialogue
One of the biggest challenges in the development and deployment of spoken
dialogue systems is the design of the spoken language generation module. This
challenge arises from the need for the generator to adapt to many features of
the dialogue domain, user population, and dialogue context. A promising
approach is trainable generation, which uses general-purpose linguistic
knowledge that is automatically adapted to the features of interest, such as
the application domain, individual user, or user group. In this paper we
present and evaluate a trainable sentence planner for providing restaurant
information in the MATCH dialogue system. We show that trainable sentence
planning can produce complex information presentations whose quality is
comparable to the output of a template-based generator tuned to this domain. We
also show that our method easily supports adapting the sentence planner to
individuals, and that the individualized sentence planners generally perform
better than models trained and tested on a population of individuals. Previous
work has documented and utilized individual preferences for content selection,
but to our knowledge, these results provide the first demonstration of
individual preferences for sentence planning operations, affecting the content
order, discourse structure and sentence structure of system responses. Finally,
we evaluate the contribution of different feature sets, and show that, in our
application, n-gram features often do as well as features based on higher-level
linguistic representations
Survey of the State of the Art in Natural Language Generation: Core tasks, applications and evaluation
This paper surveys the current state of the art in Natural Language
Generation (NLG), defined as the task of generating text or speech from
non-linguistic input. A survey of NLG is timely in view of the changes that the
field has undergone over the past decade or so, especially in relation to new
(usually data-driven) methods, as well as new applications of NLG technology.
This survey therefore aims to (a) give an up-to-date synthesis of research on
the core tasks in NLG and the architectures adopted in which such tasks are
organised; (b) highlight a number of relatively recent research topics that
have arisen partly as a result of growing synergies between NLG and other areas
of artificial intelligence; (c) draw attention to the challenges in NLG
evaluation, relating them to similar challenges faced in other areas of Natural
Language Processing, with an emphasis on different evaluation methods and the
relationships between them.Comment: Published in Journal of AI Research (JAIR), volume 61, pp 75-170. 118
pages, 8 figures, 1 tabl
- …