Towards responsive Sensitive Artificial Listeners
This paper describes work in the recently started SEMAINE project, which aims to build a set of Sensitive Artificial Listeners: conversational agents designed to sustain an interaction with a human user despite limited verbal skills, through robust real-time recognition and generation of non-verbal behaviour, both while the agent is speaking and while it is listening. We report on data collection and on the design of a system architecture geared towards real-time responsiveness.
Integration of multimodal data based on surface registration
The paper proposes and evaluates a strategy for the alignment of anatomical and functional data of the brain. The method takes as input two different sets of images of the same patient: MR data and SPECT. It proceeds in four steps: first, it constructs two voxel models from the two image sets; next, it extracts from the two voxel models the surfaces of regions of interest; in the third step, the surfaces are interactively aligned by corresponding pairs; finally, a unique volume model is constructed by selectively applying the geometrical transformations associated with the regions and weighting their contributions. The main advantages of this strategy are (i) that it can be applied retrospectively, (ii) that it is three-dimensional, and (iii) that it is local. Its main disadvantage with regard to previously published methods is that it requires the extraction of surfaces. However, this step is often required for other stages of multimodal analysis, such as visualization, and its cost can therefore be accounted for in the global cost of the process.
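As a rough sketch of what the final fusion step might look like, the snippet below blends per-region rigid transforms into a single deformation by inverse-distance weighting. Everything here is an illustrative assumption, not the paper's implementation: the function name `blend_transforms` is hypothetical, and region centroids stand in for the true distance-to-surface maps a real system would derive from the extracted surfaces.

```python
import numpy as np

def blend_transforms(points, transforms, region_centroids, eps=1e-6):
    """Fuse per-region rigid transforms into one weighted deformation.

    points           : (N, 3) voxel coordinates to map
    transforms       : list of (4, 4) homogeneous matrices, one per region
    region_centroids : (R, 3) proxy for distance to each region's surface
                       (a hypothetical simplification of the paper's setup)
    """
    homog = np.hstack([points, np.ones((len(points), 1))])          # (N, 4)
    # Displacement each regional transform would apply to each point.
    disp = np.stack([(homog @ T.T)[:, :3] - points for T in transforms])  # (R, N, 3)
    # Inverse-distance weights: nearby regions dominate, so each
    # transform acts locally, as the abstract describes.
    d = np.linalg.norm(points[None, :, :] - region_centroids[:, None, :], axis=-1)
    w = 1.0 / (d + eps)
    w /= w.sum(axis=0, keepdims=True)                               # normalize over regions
    return points + (w[..., None] * disp).sum(axis=0)
```

The inverse-distance weighting is one plausible reading of "weighting their contributions"; it makes the blended field reduce to each region's own transform near that region's surface while varying smoothly in between.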
Learning Multi-Modal Word Representation Grounded in Visual Context
Representing the semantics of words is a long-standing problem for the natural language processing community. Most methods compute word semantics from the textual context of words in large corpora. More recently, researchers have attempted to integrate perceptual and visual features. Most of these works consider the visual appearance of objects to enhance word representations but ignore the visual environment and context in which objects appear. We propose to unify text-based techniques with vision-based techniques by simultaneously leveraging textual and visual context to learn multimodal word embeddings. We explore various choices for what can serve as a visual context and present an end-to-end method to integrate visual context elements in a multimodal skip-gram model. We provide experiments and extensive analysis of the obtained results.
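To illustrate the general idea, the sketch below extends a negative-sampling skip-gram update with an extra term that pulls a word's embedding toward a projected visual-context vector. The mixing weight `alpha`, the linear projection `M`, the single negative sample, and all names are assumptions made for illustration; the paper's actual end-to-end model is not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Hypothetical sizes: V-word vocabulary, d-dim embeddings, d_v-dim
# visual features (e.g. pooled CNN features of the surrounding scene).
V, d, d_v = 1000, 50, 128
W_in = rng.normal(scale=0.1, size=(V, d))    # target-word embeddings
W_out = rng.normal(scale=0.1, size=(V, d))   # context-word embeddings
M = rng.normal(scale=0.1, size=(d_v, d))     # projects visual features to d dims

def step(center, context, visual_feat, lr=0.05, alpha=0.5):
    """One SGD update: textual skip-gram plus a visual grounding term."""
    # --- textual skip-gram with one negative sample ---
    neg = rng.integers(V)
    for ctx, label in ((context, 1.0), (neg, 0.0)):
        g = sigmoid(W_in[center] @ W_out[ctx]) - label
        w_c = W_in[center].copy()
        W_in[center] -= lr * g * W_out[ctx]
        W_out[ctx] -= lr * g * w_c
    # --- visual context: minimize alpha/2 * ||W_in[center] - visual_feat @ M||^2 ---
    diff = W_in[center] - visual_feat @ M
    W_in[center] -= lr * alpha * diff                     # pull word toward visual vector
    M += lr * alpha * np.outer(visual_feat, diff)         # and the projection toward the word

step(center=3, context=7, visual_feat=rng.normal(size=d_v))
```

The quadratic visual term is only one way to couple the two modalities; a softmax over visual context elements, as the abstract's end-to-end formulation suggests, would replace it in a faithful implementation.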