Neural Expectation Maximization
Many real world tasks such as reasoning and physical interaction require
identification and manipulation of conceptual entities. A first step towards
solving these tasks is the automated discovery of distributed symbol-like
representations. In this paper, we explicitly formalize this problem as
inference in a spatial mixture model where each component is parametrized by a
neural network. Based on the Expectation Maximization framework we then derive
a differentiable clustering method that simultaneously learns how to group and
represent individual entities. We evaluate our method on the (sequential)
perceptual grouping task and find that it is able to accurately recover the
constituent objects. We demonstrate that the learned representations are useful
for next-step prediction.
Comment: Accepted to NIPS 2017
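The EM-style inference the abstract describes can be sketched in a few lines. Everything model-specific here is an illustrative assumption, not the paper's architecture: the "neural network" per component is reduced to a sigmoid `decode()` of a parameter vector, the pixel noise is a fixed Gaussian, and the M-step is a single gradient step (a generalized M-step) through the decoder.

```python
import numpy as np

# Minimal sketch of Neural-EM-style iterations, assuming K components whose
# "neural decoder" is just a sigmoid over per-component parameters theta_k.
rng = np.random.default_rng(0)
K, D = 3, 16                     # components, flattened pixel dimensions
x = rng.random(D)                # observed image (flattened)
theta = rng.normal(size=(K, D))  # per-component latent parameters
sigma2 = 0.25                    # assumed fixed Gaussian pixel noise

def decode(theta):
    # stand-in for the neural decoder: squash parameters to [0, 1] pixel means
    return 1.0 / (1.0 + np.exp(-theta))

lr = 0.5
for step in range(50):
    mu = decode(theta)                           # (K, D) predicted pixel means
    # E-step: posterior responsibility of each component for each pixel
    log_lik = -0.5 * (x - mu) ** 2 / sigma2      # (K, D), up to a constant
    gamma = np.exp(log_lik - log_lik.max(axis=0))
    gamma /= gamma.sum(axis=0, keepdims=True)
    # Generalized M-step: one gradient step on the expected log-likelihood,
    # back-propagated through the decoder (here: the sigmoid's derivative)
    grad_mu = gamma * (x - mu) / sigma2          # d E[log p] / d mu
    theta += lr * grad_mu * mu * (1.0 - mu)      # chain rule through decode

gamma_final = gamma  # soft pixel-to-component grouping
```

Because the whole loop is differentiable in `theta`, the same structure can be unrolled and trained end-to-end, which is what makes the clustering learnable.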
Sprite Learning and Object Category Recognition using Invariant Features
Institute for Adaptive and Neural Computation
This thesis explores the use of invariant features for learning sprites from image sequences, and
for recognising object categories in images.
A popular framework for the interpretation of image sequences is the layers or sprite model
of e.g. Wang and Adelson (1994), Irani et al. (1994). Jojic and Frey (2001) provide a generative
probabilistic model framework for this task, but their algorithm is slow as it needs to search
over discretised transformations (e.g. translations, or affines) for each layer. We show that by
using invariant features (e.g. Lowe’s SIFT features) and clustering their motions we can reduce
or eliminate the search and thus learn the sprites much faster. The algorithm is demonstrated
on example image sequences.
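The motion-clustering idea can be sketched as follows. The matched keypoint locations, the two-layer scene, and the plain k-means clusterer are all illustrative assumptions; in practice the displacements would come from matched SIFT descriptors between frames.

```python
import numpy as np

# Sketch: given hypothetical matched keypoint locations in two frames,
# cluster the displacement vectors so each cluster proposes one layer's
# translation -- no search over discretised shifts is needed.
rng = np.random.default_rng(1)
# two "layers": background moves (0, 0), a sprite moves (5, -3); add noise
bg = rng.random((40, 2)) * 100
sp = rng.random((20, 2)) * 100
p1 = np.vstack([bg, sp])
p2 = np.vstack([bg + [0, 0], sp + [5, -3]]) + rng.normal(0, 0.1, (60, 2))
disp = p2 - p1                       # one displacement per matched feature

def kmeans(x, k, iters=20):
    """Plain k-means over displacement vectors."""
    centers = x[rng.choice(len(x), k, replace=False)]
    for _ in range(iters):
        d = np.linalg.norm(x[:, None] - centers[None], axis=2)
        labels = d.argmin(axis=1)
        centers = np.array([x[labels == j].mean(axis=0) for j in range(k)])
    return centers, labels

centers, labels = kmeans(disp, k=2)
# each cluster centre is a candidate per-layer translation
```

Each recovered centre directly gives one layer's motion, which is what removes the per-layer search over discretised transformations.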
We introduce the Generative Template of Features (GTF), a parts-based model for visual
object category detection. The GTF consists of a number of parts, and for each part there is
a corresponding spatial location distribution and a distribution over ‘visual words’ (clusters of
invariant features). We evaluate the performance of the GTF model for object localisation as
compared to other techniques, and show that such a relatively simple model can give state-of-
the-art performance. We also discuss the connection of the GTF to Hough-transform-like
methods for object localisation.
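A GTF-style score for a candidate object centre can be sketched as below. The two-part layout, the five-word vocabulary, the isotropic Gaussians, the hard feature-to-part assignment, and the uniform clutter likelihood are all hypothetical numbers chosen for illustration, not the thesis's fitted model.

```python
import numpy as np

# Sketch of scoring one object-centre hypothesis under a GTF-like model:
# each part j has a Gaussian spatial distribution (offset mu_j, variance
# var_j from the centre) and a multinomial over visual words.
V, P = 5, 2                                  # vocabulary size, number of parts
mu = np.array([[-10.0, 0.0], [10.0, 0.0]])   # assumed part offsets from centre
var = np.array([4.0, 4.0])
word_probs = np.array([[0.7, 0.1, 0.1, 0.05, 0.05],
                       [0.05, 0.7, 0.1, 0.1, 0.05]])  # (P, V)
log_clutter = np.log(1e-4)  # hypothetical uniform model for stray features

# detected features: (x, y, visual-word id)
feats = [(-9.0, 1.0, 0), (11.0, -1.0, 1), (30.0, 30.0, 2)]

def log_score(centre, feats):
    """Sum over features of the best explanation: some part, or clutter."""
    total = 0.0
    for x, y, w in feats:
        offs = np.array([x, y]) - centre
        # per-part log density: isotropic Gaussian spatial term + word term
        sp = -np.sum((offs - mu) ** 2, axis=1) / (2 * var) \
             - np.log(2 * np.pi * var)
        ap = np.log(word_probs[:, w])
        total += max(np.max(sp + ap), log_clutter)
    return total

# a Hough-style search over candidate object centres
centres = [np.array([0.0, 0.0]), np.array([20.0, 20.0])]
scores = [log_score(c, feats) for c in centres]
best = centres[int(np.argmax(scores))]
```

Scanning `log_score` over a grid of centres is what connects this formulation to Hough-transform-like localisation: each feature effectively votes for the centres at which some part explains it well.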
Neural Diagrammatic Reasoning
Diagrams have been shown to be effective tools for humans to represent and reason about
complex concepts. They have been widely used to represent concepts in science teaching, to
communicate workflow in industries and to measure human fluid intelligence. Mechanised
reasoning systems typically encode diagrams into symbolic representations that can be
easily processed with rule-based expert systems. This relies on human experts to define the
framework of diagram-to-symbol mapping and the set of rules to reason with the symbols.
This means the reasoning systems cannot be easily adapted to other diagrams without
a new set of human-defined representation mapping and reasoning rules. Moreover such
systems are not able to cope with diagram inputs as raw and possibly noisy images. The
need for human input and the lack of robustness to noise significantly limit the applications
of mechanised diagrammatic reasoning systems.
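The mechanised pipeline described above — a human-defined diagram-to-symbol mapping followed by rule-based inference — can be illustrated with a toy syllogism engine. The `subset` relation and the single transitivity rule are illustrative assumptions, not drawn from any particular expert system.

```python
# Toy mechanised diagrammatic reasoner: the diagram-to-symbol mapping is
# assumed done by hand (the premises below), and a forward-chaining engine
# applies one human-written rule. Both the relation vocabulary and the rule
# are hypothetical.

# symbolic encoding of two premise diagrams: "All A are B", "All B are C"
premises = [("subset", "A", "B"), ("subset", "B", "C")]

def infer(premises):
    """Forward-chain one hand-written rule: subset is transitive."""
    facts = set(premises)
    changed = True
    while changed:
        changed = False
        for (r1, a, b) in list(facts):
            for (r2, c, d) in list(facts):
                if r1 == r2 == "subset" and b == c and a != d:
                    new = ("subset", a, d)
                    if new not in facts:
                        facts.add(new)
                        changed = True
    return facts

conclusions = infer(premises)
# the engine derives the syllogism's conclusion: All A are C
```

The brittleness the text criticises is visible here: every relation and rule is hand-supplied, and nothing in this pipeline can consume a raw, noisy diagram image.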
A key research question then arises: can we develop human-like reasoning systems that
learn to reason robustly without predefined reasoning rules? To answer this question, I
propose Neural Diagrammatic Reasoning, a new family of diagrammatic reasoning
systems which does not have the drawbacks of mechanised reasoning systems. The new
systems are based on deep neural networks, a machine learning method that has
recently achieved human-level performance on a range of tasks such as object
detection, speech recognition and natural language processing. The proposed systems are
able to learn both diagram to symbol mapping and implicit reasoning rules only from data,
with no prior human input about symbols and rules in the reasoning tasks. Specifically, I
developed EulerNet, a novel neural network model that solves Euler diagram syllogism
tasks with 99.5% accuracy. Experiments show that EulerNet learns useful representations
of the diagrams and tasks, and is robust to noise and deformation in the input data. I
also developed MXGNet, a novel multiplex graph neural architecture that solves Raven
Progressive Matrices (RPM) tasks. MXGNet achieves state-of-the-art accuracies on two
popular RPM datasets. In addition, I developed Discrete-AIR, an unsupervised learning
architecture that learns semi-symbolic representations of diagrams without any labels.
Lastly, I designed a novel inductive bias module that can be readily used in today’s deep
neural networks to improve their generalisation capability on relational reasoning tasks.
EPSRC Studentship and Cambridge Trust Scholarship