Neural Expectation Maximization
Many real world tasks such as reasoning and physical interaction require
identification and manipulation of conceptual entities. A first step towards
solving these tasks is the automated discovery of distributed symbol-like
representations. In this paper, we explicitly formalize this problem as
inference in a spatial mixture model where each component is parametrized by a
neural network. Based on the Expectation Maximization framework we then derive
a differentiable clustering method that simultaneously learns how to group and
represent individual entities. We evaluate our method on the (sequential)
perceptual grouping task and find that it is able to accurately recover the
constituent objects. We demonstrate that the learned representations are useful
for next-step prediction.
Comment: Accepted to NIPS 2017
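The EM-style inference the abstract describes can be sketched in a few lines. Everything model-specific here is an illustrative assumption, not the paper's architecture: the "neural network" per component is reduced to a sigmoid `decode()` of a parameter vector, the pixel noise is a fixed Gaussian, and the M-step is a single gradient step (a generalized M-step) through the decoder.

```python
import numpy as np

# Minimal sketch of Neural-EM-style iterations, assuming K components whose
# "neural decoder" is just a sigmoid over per-component parameters theta_k.
rng = np.random.default_rng(0)
K, D = 3, 16                     # components, flattened pixel dimensions
x = rng.random(D)                # observed image (flattened)
theta = rng.normal(size=(K, D))  # per-component latent parameters
sigma2 = 0.25                    # assumed fixed Gaussian pixel noise

def decode(theta):
    # stand-in for the neural decoder: squash parameters to [0, 1] pixel means
    return 1.0 / (1.0 + np.exp(-theta))

lr = 0.5
for step in range(50):
    mu = decode(theta)                           # (K, D) predicted pixel means
    # E-step: posterior responsibility of each component for each pixel
    log_lik = -0.5 * (x - mu) ** 2 / sigma2      # (K, D), up to a constant
    gamma = np.exp(log_lik - log_lik.max(axis=0))
    gamma /= gamma.sum(axis=0, keepdims=True)
    # Generalized M-step: one gradient step on the expected log-likelihood,
    # back-propagated through the decoder (here: the sigmoid's derivative)
    grad_mu = gamma * (x - mu) / sigma2          # d E[log p] / d mu
    theta += lr * grad_mu * mu * (1.0 - mu)      # chain rule through decode

gamma_final = gamma  # soft pixel-to-component grouping
```

Because the whole loop is differentiable in `theta`, the same structure can be unrolled and trained end-to-end, which is what makes the clustering learnable.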
Sprite Learning and Object Category Recognition using Invariant Features
Institute for Adaptive and Neural Computation
This thesis explores the use of invariant features for learning sprites from image sequences, and
for recognising object categories in images.
A popular framework for the interpretation of image sequences is the layers or sprite model
of e.g. Wang and Adelson (1994), Irani et al. (1994). Jojic and Frey (2001) provide a generative
probabilistic model framework for this task, but their algorithm is slow as it needs to search
over discretised transformations (e.g. translations, or affines) for each layer. We show that by
using invariant features (e.g. Lowe’s SIFT features) and clustering their motions we can reduce
or eliminate the search and thus learn the sprites much faster. The algorithm is demonstrated
on example image sequences.
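The motion-clustering idea can be sketched as follows. The matched keypoint locations, the two-layer scene, and the plain k-means clusterer are all illustrative assumptions; in practice the displacements would come from matched SIFT descriptors between frames.

```python
import numpy as np

# Sketch: given hypothetical matched keypoint locations in two frames,
# cluster the displacement vectors so each cluster proposes one layer's
# translation -- no search over discretised shifts is needed.
rng = np.random.default_rng(1)
# two "layers": background moves (0, 0), a sprite moves (5, -3); add noise
bg = rng.random((40, 2)) * 100
sp = rng.random((20, 2)) * 100
p1 = np.vstack([bg, sp])
p2 = np.vstack([bg + [0, 0], sp + [5, -3]]) + rng.normal(0, 0.1, (60, 2))
disp = p2 - p1                       # one displacement per matched feature

def kmeans(x, k, iters=20):
    """Plain k-means over displacement vectors."""
    centers = x[rng.choice(len(x), k, replace=False)]
    for _ in range(iters):
        d = np.linalg.norm(x[:, None] - centers[None], axis=2)
        labels = d.argmin(axis=1)
        centers = np.array([x[labels == j].mean(axis=0) for j in range(k)])
    return centers, labels

centers, labels = kmeans(disp, k=2)
# each cluster centre is a candidate per-layer translation
```

Each recovered centre directly gives one layer's motion, which is what removes the per-layer search over discretised transformations.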
We introduce the Generative Template of Features (GTF), a parts-based model for visual
object category detection. The GTF consists of a number of parts, and for each part there is
a corresponding spatial location distribution and a distribution over ‘visual words’ (clusters of
invariant features). We evaluate the performance of the GTF model for object localisation as
compared to other techniques, and show that such a relatively simple model can give state-of-
the-art performance. We also discuss the connection of the GTF to Hough-transform-like
methods for object localisation.
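A GTF-style score for a candidate object centre can be sketched as below. The two-part layout, the five-word vocabulary, the isotropic Gaussians, the hard feature-to-part assignment, and the uniform clutter likelihood are all hypothetical numbers chosen for illustration, not the thesis's fitted model.

```python
import numpy as np

# Sketch of scoring one object-centre hypothesis under a GTF-like model:
# each part j has a Gaussian spatial distribution (offset mu_j, variance
# var_j from the centre) and a multinomial over visual words.
V, P = 5, 2                                  # vocabulary size, number of parts
mu = np.array([[-10.0, 0.0], [10.0, 0.0]])   # assumed part offsets from centre
var = np.array([4.0, 4.0])
word_probs = np.array([[0.7, 0.1, 0.1, 0.05, 0.05],
                       [0.05, 0.7, 0.1, 0.1, 0.05]])  # (P, V)
log_clutter = np.log(1e-4)  # hypothetical uniform model for stray features

# detected features: (x, y, visual-word id)
feats = [(-9.0, 1.0, 0), (11.0, -1.0, 1), (30.0, 30.0, 2)]

def log_score(centre, feats):
    """Sum over features of the best explanation: some part, or clutter."""
    total = 0.0
    for x, y, w in feats:
        offs = np.array([x, y]) - centre
        # per-part log density: isotropic Gaussian spatial term + word term
        sp = -np.sum((offs - mu) ** 2, axis=1) / (2 * var) \
             - np.log(2 * np.pi * var)
        ap = np.log(word_probs[:, w])
        total += max(np.max(sp + ap), log_clutter)
    return total

# a Hough-style search over candidate object centres
centres = [np.array([0.0, 0.0]), np.array([20.0, 20.0])]
scores = [log_score(c, feats) for c in centres]
best = centres[int(np.argmax(scores))]
```

Scanning `log_score` over a grid of centres is what connects this formulation to Hough-transform-like localisation: each feature effectively votes for the centres at which some part explains it well.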
Neural Diagrammatic Reasoning
Diagrams have been shown to be effective tools for humans to represent and reason about
complex concepts. They have been widely used to represent concepts in science teaching, to
communicate workflow in industries and to measure human fluid intelligence. Mechanised
reasoning systems typically encode diagrams into symbolic representations that can be
easily processed with rule-based expert systems. This relies on human experts to define the
framework of diagram-to-symbol mapping and the set of rules to reason with the symbols.
This means the reasoning systems cannot be easily adapted to other diagrams without
a new set of human-defined representation mapping and reasoning rules. Moreover such
systems are not able to cope with diagram inputs as raw and possibly noisy images. The
need for human input and the lack of robustness to noise significantly limit the applications
of mechanised diagrammatic reasoning systems.
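The mechanised pipeline described above — a human-defined diagram-to-symbol mapping followed by rule-based inference — can be illustrated with a toy syllogism engine. The `subset` relation and the single transitivity rule are illustrative assumptions, not drawn from any particular expert system.

```python
# Toy mechanised diagrammatic reasoner: the diagram-to-symbol mapping is
# assumed done by hand (the premises below), and a forward-chaining engine
# applies one human-written rule. Both the relation vocabulary and the rule
# are hypothetical.

# symbolic encoding of two premise diagrams: "All A are B", "All B are C"
premises = [("subset", "A", "B"), ("subset", "B", "C")]

def infer(premises):
    """Forward-chain one hand-written rule: subset is transitive."""
    facts = set(premises)
    changed = True
    while changed:
        changed = False
        for (r1, a, b) in list(facts):
            for (r2, c, d) in list(facts):
                if r1 == r2 == "subset" and b == c and a != d:
                    new = ("subset", a, d)
                    if new not in facts:
                        facts.add(new)
                        changed = True
    return facts

conclusions = infer(premises)
# the engine derives the syllogism's conclusion: All A are C
```

The brittleness the text criticises is visible here: every relation and rule is hand-supplied, and nothing in this pipeline can consume a raw, noisy diagram image.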
A key research question then arises: can we develop human-like reasoning systems that
learn to reason robustly without predefined reasoning rules? To answer this question, I
propose Neural Diagrammatic Reasoning, a new family of diagrammatic reasoning
systems which does not have the drawbacks of mechanised reasoning systems. The new
systems are based on deep neural networks, a machine learning method that has
recently achieved human-level performance on a range of tasks such as object
detection, speech recognition and natural language processing. The proposed systems are
able to learn both diagram to symbol mapping and implicit reasoning rules only from data,
with no prior human input about symbols and rules in the reasoning tasks. Specifically, I
developed EulerNet, a novel neural network model that solves Euler diagram syllogism
tasks with 99.5% accuracy. Experiments show that EulerNet learns useful representations
of the diagrams and tasks, and is robust to noise and deformation in the input data. I
also developed MXGNet, a novel multiplex graph neural architecture that solves Raven
Progressive Matrices (RPM) tasks. MXGNet achieves state-of-the-art accuracies on two
popular RPM datasets. In addition, I developed Discrete-AIR, an unsupervised learning
architecture that learns semi-symbolic representations of diagrams without any labels.
Lastly, I designed a novel inductive bias module that can be readily used in today’s deep
neural networks to improve their generalisation capability on relational reasoning tasks.
EPSRC Studentship and Cambridge Trust Scholarship