Canonical Correlation Inference for Mapping Abstract Scenes to Text
We describe a technique for structured prediction, based on canonical
correlation analysis. Our learning algorithm finds two projections for the
input and the output spaces that aim at projecting a given input and its
correct output into points close to each other. We demonstrate our technique on
a language-vision problem, namely the problem of giving a textual description
to an "abstract scene". Comment: 10 pages, accepted to AAAI 201
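As a rough illustration of the idea in this abstract (two projections that place an input and its correct output close together), the following is a minimal sketch of plain linear CCA followed by nearest-neighbour decoding over a candidate set. It is not the paper's algorithm: the function names `fit_cca` and `nearest_output`, the regularisation constant, and the toy data are all assumptions made for this example.

```python
import numpy as np

def fit_cca(X, Y, k, reg=1e-6):
    """Fit a k-dimensional linear CCA; returns (mean_x, mean_y, A, B, corrs)."""
    mx, my = X.mean(axis=0), Y.mean(axis=0)
    Xc, Yc = X - mx, Y - my
    n = X.shape[0]
    # Regularised covariance and cross-covariance estimates.
    Cxx = Xc.T @ Xc / (n - 1) + reg * np.eye(X.shape[1])
    Cyy = Yc.T @ Yc / (n - 1) + reg * np.eye(Y.shape[1])
    Cxy = Xc.T @ Yc / (n - 1)

    def inv_sqrt(C):
        # Inverse matrix square root of a symmetric positive-definite matrix.
        w, V = np.linalg.eigh(C)
        return V @ np.diag(1.0 / np.sqrt(w)) @ V.T

    Wx, Wy = inv_sqrt(Cxx), inv_sqrt(Cyy)
    # Singular values of the whitened cross-covariance are the canonical correlations.
    U, s, Vt = np.linalg.svd(Wx @ Cxy @ Wy)
    A = Wx @ U[:, :k]      # projection for the input space
    B = Wy @ Vt.T[:, :k]   # projection for the output space
    return mx, my, A, B, s[:k]

def nearest_output(x, candidates, model):
    """Index of the candidate output whose projection is closest to that of x."""
    mx, my, A, B, _ = model
    u = (x - mx) @ A
    V = (candidates - my) @ B
    return int(np.argmin(np.linalg.norm(V - u, axis=1)))

# Toy data (hypothetical): both views are noisy linear images of a shared 2-D latent.
rng = np.random.default_rng(0)
Z = rng.normal(size=(500, 2))
Mx = np.array([[1.0, 0.0, 1.0, 0.0], [0.0, 1.0, 0.0, 1.0]])
My = np.array([[1.0, 0.0, 1.0], [0.0, 1.0, -1.0]])
X = Z @ Mx + 0.01 * rng.normal(size=(500, 4))
Y = Z @ My + 0.01 * rng.normal(size=(500, 3))

model = fit_cca(X, Y, k=2)
corrs = model[4]
best = nearest_output(X[0], Y[:5], model)
```

In a structured-prediction setting the candidate set would be a (possibly searched) space of outputs rather than a small fixed array; the brute-force `argmin` here only shows the decoding principle.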
Computational Modelling of Information Gathering
This thesis describes computational modelling of information-gathering behaviour under active inference, a framework for describing Bayes-optimal behaviour. Under active inference, perception, attention and action all serve the same purpose: minimising variational free energy. Variational free energy is an upper bound on surprise, and minimising it maximises an agent’s evidence for its survival. An agent achieves this by acquiring information (resolving uncertainty) about the hidden states of the world and using the acquired information to act towards the outcomes it prefers. In this work I placed special emphasis on the resolution of uncertainty about the states of the world. I first created a visual search task called the scene construction task. In this task one needs to accumulate evidence for competing hypotheses (different visual scenes) through sequential sampling of a visual scene, categorising it once there is sufficient evidence. I showed that a computational agent attends to the most salient (epistemically valuable) locations in this task. Next, this task was performed by healthy humans. Their exploration strategies provided evidence for uncertainty-driven exploration. I also showed how different exploratory behaviours can be characterised using canonical correlation analysis. In the next study I showed how exploration of a visual scene under different instructions could be explained by appealing to the computational mechanisms that may correspond to attention. This entailed manipulating the precision of task-irrelevant cues and their hidden causes as a function of instructions. In the final work, I was interested in characterising impulsive behaviour using a patch-leaving paradigm. By varying the parameters of the Markov decision process (MDP) model, I showed that there could be at least three distinct causes of impulsive behaviour, namely a lower depth of planning, a lower capacity to maintain and process information, and an increased perceived value of immediate rewards.
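The statement that variational free energy upper-bounds surprise can be written out in the standard way. The notation below is an assumption made for this note and does not come from the abstract: o denotes observed outcomes, s hidden states, p the agent's generative model, and q an approximate posterior over hidden states.

```latex
\begin{aligned}
F[q] &= \mathbb{E}_{q(s)}\bigl[\ln q(s) - \ln p(o, s)\bigr] \\
     &= D_{\mathrm{KL}}\bigl[q(s)\,\|\,p(s \mid o)\bigr] - \ln p(o) \\
     &\ge -\ln p(o)
\end{aligned}
```

The inequality follows because the KL divergence is non-negative; the bound is tight when q(s) equals the exact posterior p(s | o), so lowering F either improves the posterior approximation or raises the model evidence p(o).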
Factorized Topic Models
In this paper we present a modification to a latent topic model, which makes
the model exploit supervision to produce a factorized representation of the
observed data. The structured parameterization encodes variance that is
shared between classes separately from variance that is private to each class,
by introducing a new prior over the topic space. The approach allows for
more efficient inference and provides an intuitive interpretation of the data
in terms of an informative signal together with structured noise. The
factorized representation is shown to enhance inference performance for image,
text, and video classification. Comment: ICLR 201
Data-Driven Shape Analysis and Processing
Data-driven methods play an increasingly important role in discovering
geometric, structural, and semantic relationships between 3D shapes in
collections, and applying this analysis to support intelligent modeling,
editing, and visualization of geometric data. In contrast to traditional
approaches, a key feature of data-driven approaches is that they aggregate
information from a collection of shapes to improve the analysis and processing
of individual shapes. In addition, they are able to learn models that reason
about properties and relationships of shapes without relying on hard-coded
rules or explicitly programmed instructions. We provide an overview of the main
concepts and components of these techniques, and discuss their application to
shape classification, segmentation, matching, reconstruction, modeling and
exploration, as well as scene analysis and synthesis, through reviewing the
literature and relating the existing works with both qualitative and numerical
comparisons. We conclude our report with ideas that can inspire future research
in data-driven shape analysis and processing. Comment: 10 pages, 19 figures
Deep Fragment Embeddings for Bidirectional Image Sentence Mapping
We introduce a model for bidirectional retrieval of images and sentences
through a multi-modal embedding of visual and natural language data. Unlike
previous models that directly map images or sentences into a common embedding
space, our model works on a finer level and embeds fragments of images
(objects) and fragments of sentences (typed dependency tree relations) into a
common space. In addition to a ranking objective seen in previous work, this
allows us to add a new fragment alignment objective that learns to directly
associate these fragments across modalities. Extensive experimental evaluation
shows that reasoning on both the global level of images and sentences and the
finer level of their respective fragments significantly improves performance on
image-sentence retrieval tasks. Additionally, our model provides interpretable
predictions since the inferred inter-modal fragment alignment is explicit.
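The "ranking objective" mentioned in this abstract can be illustrated with a generic bidirectional max-margin loss over an image-sentence score matrix. This is a sketch of the general technique only, not the paper's exact objective; the function name `bidirectional_ranking_loss`, the margin value, and the convention that matching pairs lie on the diagonal of `S` are assumptions for this example.

```python
import numpy as np

def bidirectional_ranking_loss(S, margin=1.0):
    """Hinge ranking loss on a square score matrix S.

    S[i, j] is the score of image i against sentence j; the correct pairs
    are assumed to sit on the diagonal (image i matches sentence i).
    """
    S = np.asarray(S, dtype=float)
    diag = np.diag(S)
    # Image-to-sentence: each wrong sentence should score at least `margin`
    # below the correct sentence for the same image (compare along rows).
    row_viol = np.maximum(0.0, S - diag[:, None] + margin)
    # Sentence-to-image: each wrong image should score at least `margin`
    # below the correct image for the same sentence (compare along columns).
    col_viol = np.maximum(0.0, S - diag[None, :] + margin)
    # Correct pairs are not compared against themselves.
    np.fill_diagonal(row_viol, 0.0)
    np.fill_diagonal(col_viol, 0.0)
    return float(row_viol.sum() + col_viol.sum())

# Well-separated scores incur no loss; uninformative scores are penalised.
loss_good = bidirectional_ranking_loss([[2.0, 0.0], [0.0, 2.0]])
loss_flat = bidirectional_ranking_loss(np.zeros((2, 2)))
```

A fragment-level alignment term such as the one the abstract describes would be added on top of a global loss of this kind; it is not sketched here because the abstract does not specify its form.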