Towards Modular Algorithm Induction
We present Main, a modular neural network architecture that learns algorithms
from a set of input-output examples. Main consists of a neural controller that
interacts with a variable-length input tape and learns to compose modules
together with their corresponding argument choices. Unlike previous approaches,
Main uses a general domain-agnostic mechanism for selection of modules and
their arguments. It uses a general input tape layout together with a parallel
history tape to indicate most recently used locations. Finally, it uses a
memoryless controller with a length-invariant self-attention based input tape
encoding to allow for random access to tape locations. The Main architecture is
trained end-to-end using reinforcement learning from a set of input-output
examples. We evaluate Main on five algorithmic tasks and show that it can learn
policies that generalize perfectly to inputs of much longer lengths than the
ones used for training.
Comment: 10 pages, 4 figures, 2 tables
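The ingredients named above (a memoryless controller, a self-attention tape encoding that does not depend on tape length, and module/argument selection) can be illustrated with a minimal sketch. All names, dimensions, and the module set here are hypothetical, chosen only to make the mechanism concrete; this is not the paper's implementation.

```python
import numpy as np

rng = np.random.default_rng(0)
D = 8                               # embedding width (assumed)
MODULES = ["inc", "dec", "copy"]    # illustrative module set, not from the paper

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(tape):
    # tape: (n, D); n may vary per input, the parameters do not depend on n,
    # so the encoding is length-invariant as described in the abstract.
    scores = tape @ tape.T / np.sqrt(D)
    return softmax(scores, axis=-1) @ tape          # (n, D)

def controller_step(tape, Wm, Wa):
    # Memoryless: the decision uses only the current tape encoding,
    # no recurrent state is carried between steps.
    enc = self_attention(tape)
    summary = enc.mean(axis=0)                      # pooled tape summary
    module = MODULES[int(np.argmax(summary @ Wm))]  # pick a module
    arg_loc = int(np.argmax(enc @ Wa))              # random access: any cell
    return module, arg_loc

tape = rng.normal(size=(5, D))      # a 5-cell input tape of embeddings
Wm = rng.normal(size=(D, len(MODULES)))
Wa = rng.normal(size=(D,))
module, arg = controller_step(tape, Wm, Wa)
print(module, arg)
```

In training, the module and argument choices would be sampled rather than argmaxed and the weights updated with a policy-gradient method, matching the end-to-end reinforcement-learning setup the abstract describes.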
Matrix Shuffle-Exchange Networks for Hard 2D Tasks
Convolutional neural networks have become the main tools for processing
two-dimensional data. They work well for images, yet convolutions have a
limited receptive field that prevents their application to more complex 2D
tasks. We propose a new neural model, called Matrix Shuffle-Exchange network,
that can efficiently exploit long-range dependencies in 2D data and has
comparable speed to a convolutional neural network. It is derived from the Neural
Shuffle-Exchange network and has O(log n) layers and O(n^2 log n) total time and
space complexity for processing an n x n data matrix. We show that the Matrix Shuffle-Exchange network is
well-suited for algorithmic and logical reasoning tasks on matrices and dense
graphs, exceeding convolutional and graph neural network baselines. Its
distinct advantage is its ability to retain full long-range dependency
modelling when generalizing to larger instances - much larger than could be
processed by models equipped with a dense attention mechanism.
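The long-range connectivity in Shuffle-Exchange style networks, from which this model is derived, comes from the perfect (riffle) shuffle wiring: cell i is routed to the position given by a one-bit left rotation of i's binary index, so O(log n) shuffle layers suffice for any two of n cells to exchange information. A small sketch of that wiring (the function names are mine, for illustration only):

```python
def perfect_shuffle(seq):
    """Riffle shuffle: interleave the two halves of seq (len must be 2^k)."""
    half = len(seq) // 2
    out = []
    for i in range(half):
        out.append(seq[i])        # card from the top half
        out.append(seq[half + i]) # card from the bottom half
    return out

def rotate_left(i, bits):
    """Equivalent view: destination index = left bit-rotation of source index."""
    return ((i << 1) | (i >> (bits - 1))) & ((1 << bits) - 1)

n, bits = 8, 3
seq = list(range(n))
shuffled = perfect_shuffle(seq)
print(shuffled)  # [0, 4, 1, 5, 2, 6, 3, 7]
```

Each element at source index i lands at index rotate_left(i, bits), which is why repeating the shuffle log2(n) times visits every bit of the index and connects all cell pairs.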