3,507 research outputs found
Training neural networks to encode symbols enables combinatorial generalization
Combinatorial generalization - the ability to understand and produce novel
combinations of already familiar elements - is considered to be a core capacity
of the human mind and a major challenge to neural network models. A significant
body of research suggests that conventional neural networks can't solve this
problem unless they are endowed with mechanisms specifically engineered for the
purpose of representing symbols. In this paper we introduce a novel way of
representing symbolic structures in connectionist terms - the vectors approach
to representing symbols (VARS), which allows training standard neural
architectures to encode symbolic knowledge explicitly at their output layers.
In two simulations, we show that neural networks not only can learn to produce
VARS representations, but in doing so they achieve combinatorial generalization
in their symbolic and non-symbolic output. This adds to other recent work that
has shown improved combinatorial generalization under specific training
conditions, and raises the question of whether specific mechanisms or training
routines are needed to support symbolic processing
Graph Element Networks: adaptive, structured computation and memory
We explore the use of graph neural networks (GNNs) to model spatial processes
in which there is no a priori graphical structure. Similar to finite element
analysis, we assign nodes of a GNN to spatial locations and use a computational
process defined on the graph to model the relationship between an initial
function defined over a space and a resulting function in the same space. We
use GNNs as a computational substrate, and show that the locations of the nodes
in space as well as their connectivity can be optimized to focus on the most
complex parts of the space. Moreover, this representational strategy allows the
learned input-output relationship to generalize over the size of the underlying
space and run the same model at different levels of precision, trading
computation for accuracy. We demonstrate this method on a traditional PDE
problem, a physical prediction problem from robotics, and learning to predict
scene images from novel viewpoints.Comment: Accepted to ICML 201
A mathematical theory of semantic development in deep neural networks
An extensive body of empirical research has revealed remarkable regularities
in the acquisition, organization, deployment, and neural representation of
human semantic knowledge, thereby raising a fundamental conceptual question:
what are the theoretical principles governing the ability of neural networks to
acquire, organize, and deploy abstract knowledge by integrating across many
individual experiences? We address this question by mathematically analyzing
the nonlinear dynamics of learning in deep linear networks. We find exact
solutions to this learning dynamics that yield a conceptual explanation for the
prevalence of many disparate phenomena in semantic cognition, including the
hierarchical differentiation of concepts through rapid developmental
transitions, the ubiquity of semantic illusions between such transitions, the
emergence of item typicality and category coherence as factors controlling the
speed of semantic processing, changing patterns of inductive projection over
development, and the conservation of semantic similarity in neural
representations across species. Thus, surprisingly, our simple neural model
qualitatively recapitulates many diverse regularities underlying semantic
development, while providing analytic insight into how the statistical
structure of an environment can interact with nonlinear deep learning dynamics
to give rise to these regularities
- …