5,225 research outputs found

Canonical Correlation Inference for Mapping Abstract Scenes to Text

Author: Cohen Shay
Jiang Helen
Papasarantopoulos Nikolaos
Publication date: 17/11/2017

We describe a technique for structured prediction based on canonical correlation analysis. Our learning algorithm finds two projections, one for the input space and one for the output space, that aim to map a given input and its correct output to points close to each other. We demonstrate our technique on a language-vision problem, namely the problem of giving a textual description to an "abstract scene". Comment: 10 pages, accepted to AAAI 201
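The idea of projecting inputs and outputs into a shared space can be illustrated with a minimal sketch. It uses classical linear CCA on synthetic data and is not the authors' algorithm; the function name `fit_cca`, the regulariser `reg`, the mixing matrices and the nearest-candidate decoding step are all assumptions made for illustration.

```python
import numpy as np

def fit_cca(X, Y, k, reg=1e-3):
    """Classical linear CCA via whitening and SVD (illustrative sketch)."""
    mx, my = X.mean(axis=0), Y.mean(axis=0)
    Xc, Yc = X - mx, Y - my
    n = X.shape[0]
    # Regularised covariance and cross-covariance estimates.
    Cxx = Xc.T @ Xc / n + reg * np.eye(X.shape[1])
    Cyy = Yc.T @ Yc / n + reg * np.eye(Y.shape[1])
    Cxy = Xc.T @ Yc / n

    def inv_sqrt(C):
        # Inverse matrix square root of a symmetric positive-definite matrix.
        w, V = np.linalg.eigh(C)
        return V @ np.diag(w ** -0.5) @ V.T

    Wx, Wy = inv_sqrt(Cxx), inv_sqrt(Cyy)
    U, s, Vt = np.linalg.svd(Wx @ Cxy @ Wy)
    A = Wx @ U[:, :k]      # projection for the input space
    B = Wy @ Vt.T[:, :k]   # projection for the output space
    return mx, my, A, B, s[:k]

# Synthetic paired data: both views are noisy linear images of a shared latent Z.
rng = np.random.default_rng(0)
n = 200
Z = rng.standard_normal((n, 2))
Mx = np.array([[1.0, 0.0, 1.0, 0.5, -1.0],
               [0.0, 1.0, -1.0, 0.5, 1.0]])   # hypothetical mixing matrix
My = np.array([[1.0, 1.0, 0.0, -1.0],
               [-1.0, 1.0, 1.0, 0.0]])        # hypothetical mixing matrix
X = Z @ Mx + 0.005 * rng.standard_normal((n, 5))
Y = Z @ My + 0.005 * rng.standard_normal((n, 4))

mx, my, A, B, corrs = fit_cca(X, Y, k=2)

# Decode by picking, for each projected input, the closest projected candidate output.
P = (X - mx) @ A
Q = (Y - my) @ B
dists = np.linalg.norm(P[:, None, :] - Q[None, :, :], axis=2)
pred = dists.argmin(axis=1)
accuracy = float(np.mean(pred == np.arange(n)))
```

Here the candidate outputs are simply the training outputs, so `accuracy` measures how often an input lands nearest to its own paired output; a structured-prediction setting would instead search over a space of candidate outputs.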

arXiv.org e-Print Archive

Edinburgh Research Explorer

Association for the Advancement of Artificial Intelligence: AAAI Publications

Computational Modelling of Information Gathering

Author: Mirza Muammer Berk
Publication venue: UCL (University College London)
Publication date: 28/02/2019

This thesis describes computational modelling of information gathering behaviour under active inference, a framework for describing Bayes optimal behaviour. Under active inference, perception, attention and action all serve the same purpose: minimising variational free energy. Variational free energy is an upper bound on surprise, and minimising it maximises an agent’s evidence for its survival. An agent achieves this by acquiring information (resolving uncertainty) about the hidden states of the world and using the acquired information to act on the outcomes it prefers. In this work I placed special emphasis on the resolution of uncertainty about the states of the world. I first created a visual search task called the scene construction task. In this task one needs to accumulate evidence for competing hypotheses (different visual scenes) through sequential sampling of a visual scene, categorising it once there is sufficient evidence. I showed that a computational agent attends to the most salient (epistemically valuable) locations in this task. In the next study, healthy human participants performed this task. Their exploration strategies provided evidence for uncertainty-driven exploration. I also showed how different exploratory behaviours can be characterised using canonical correlation analysis. In the following study I showed how exploration of a visual scene under different instructions could be explained by appealing to the computational mechanisms that may correspond to attention. This entailed manipulating the precision of task-irrelevant cues and their hidden causes as a function of instructions. In the final work, I was interested in characterising impulsive behaviour using a patch-leaving paradigm. By varying the parameters of the Markov decision process (MDP) model, I showed that there could be at least three distinct causes of impulsive behaviour, namely a lower depth of planning, a lower capacity to maintain and process information, and an increased perceived value of immediate rewards.
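The statement that variational free energy upper-bounds surprise follows from a standard decomposition, sketched here. The symbols are assumptions not taken from the abstract: o denotes observed outcomes, s hidden states, q(s) an approximate posterior, and p(o, s) the agent's generative model.

```latex
\begin{aligned}
F[q] &= \mathbb{E}_{q(s)}\big[\ln q(s) - \ln p(o, s)\big] \\
     &= D_{\mathrm{KL}}\big[q(s)\,\|\,p(s \mid o)\big] - \ln p(o) \\
     &\ge -\ln p(o)
\end{aligned}
```

The inequality holds because the KL divergence is non-negative, so F equals the surprise, -ln p(o), exactly when q(s) matches the true posterior p(s | o).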

UCL Discovery

Factorized Topic Models

Author: Damianou Andreas
Ek Carl Henrik
Kjellstrom Hedvig
Zhang Cheng
Publication date: 01/01/2013

In this paper we present a modification to a latent topic model that makes the model exploit supervision to produce a factorized representation of the observed data. By introducing a new prior over the topic space, the structured parameterization encodes variance that is shared between classes separately from variance that is private to each class. The approach allows for more efficient inference and provides an intuitive interpretation of the data in terms of an informative signal together with structured noise. The factorized representation is shown to enhance inference performance for image, text, and video classification. Comment: ICLR 201

arXiv.org e-Print Archive

Publikationer från KTH

Digitala Vetenskapliga Arkivet - Academic Archive On-line

Explore Bristol Research

Data-Driven Shape Analysis and Processing

Author: Huang Qixing
Kalogerakis Evangelos
Kim Vladimir G.
Xu Kai
Publication date: 23/02/2015

Data-driven methods play an increasingly important role in discovering geometric, structural, and semantic relationships between 3D shapes in collections, and in applying this analysis to support intelligent modeling, editing, and visualization of geometric data. In contrast to traditional approaches, a key feature of data-driven approaches is that they aggregate information from a collection of shapes to improve the analysis and processing of individual shapes. In addition, they are able to learn models that reason about properties and relationships of shapes without relying on hard-coded rules or explicitly programmed instructions. We provide an overview of the main concepts and components of these techniques, and discuss their application to shape classification, segmentation, matching, reconstruction, modeling and exploration, as well as scene analysis and synthesis, by reviewing the literature and relating existing works through both qualitative and numerical comparisons. We conclude our report with ideas that can inspire future research in data-driven shape analysis and processing. Comment: 10 pages, 19 figures

arXiv.org e-Print Archive

CiteSeerX

Deep Fragment Embeddings for Bidirectional Image Sentence Mapping

Author: Fei-Fei Li
Joulin Armand
Karpathy Andrej
Publication date: 22/06/2014

We introduce a model for bidirectional retrieval of images and sentences through a multi-modal embedding of visual and natural language data. Unlike previous models that directly map images or sentences into a common embedding space, our model works on a finer level and embeds fragments of images (objects) and fragments of sentences (typed dependency tree relations) into a common space. In addition to a ranking objective seen in previous work, this allows us to add a new fragment alignment objective that learns to directly associate these fragments across modalities. Extensive experimental evaluation shows that reasoning on both the global level of images and sentences and the finer level of their respective fragments significantly improves performance on image-sentence retrieval tasks. Additionally, our model provides interpretable predictions, since the inferred inter-modal fragment alignment is explicit.

arXiv.org e-Print Archive

CiteSeerX