745,375 research outputs found

Memory Networks

Authors: Bordes Antoine, Chopra Sumit, Weston Jason
Publication date: 29/11/2015

We describe a new class of learning models called memory networks. Memory networks reason with inference components combined with a long-term memory component; they learn how to use these jointly. The long-term memory can be read and written to, with the goal of using it for prediction. We investigate these models in the context of question answering (QA) where the long-term memory effectively acts as a (dynamic) knowledge base, and the output is a textual response. We evaluate them on a large-scale QA task, and a smaller, but more complex, toy task generated from a simulated world. In the latter, we show the reasoning power of such models by chaining multiple supporting sentences to answer questions that require understanding the intension of verbs.

arXiv.org e-Print Archive

CiteSeerX
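
The Memory Networks abstract above describes a long-term memory that is written to, read from, and queried over several supporting sentences. The following is a minimal sketch of that read/write/chain pattern only. It is not the authors' model: the word-overlap score is a hand-written stand-in for the learned scoring functions, and the names `tokenize`, `score`, `read_memory`, `answer_two_hops`, and the toy story are all hypothetical.

```python
def tokenize(text):
    """Lowercase bag of words with sentence punctuation stripped."""
    return set(text.lower().replace("?", "").replace(".", "").split())


def score(query_words, sentence):
    """Assumption: word overlap stands in for a learned relevance score."""
    return len(query_words & tokenize(sentence))


def read_memory(memory, query_words, exclude=()):
    """Return the index of the best-matching memory slot.

    Ties are broken in favour of the most recently written slot.
    """
    candidates = [i for i in range(len(memory)) if i not in exclude]
    return max(candidates, key=lambda i: (score(query_words, memory[i]), i))


def answer_two_hops(memory, question):
    """Chain two supporting sentences: the second lookup is conditioned
    on both the question and the first retrieved sentence."""
    q = tokenize(question)
    first = read_memory(memory, q)
    q_extended = q | tokenize(memory[first])
    second = read_memory(memory, q_extended, exclude=(first,))
    return first, second


# Writing to memory is just appending sentences in the order they arrive.
memory = []
for sentence in [
    "Joe went to the kitchen",
    "Joe picked up the milk",
    "Joe travelled to the office",
]:
    memory.append(sentence)

first, second = answer_two_hops(memory, "Where is the milk?")
print(memory[first])   # sentence that mentions the milk
print(memory[second])  # most recent sentence about whoever holds it
```

In this toy story the first hop finds who has the milk and the second hop finds where that person went last, which is the kind of chaining the abstract refers to. A real memory network would learn both lookups from data instead of counting shared words.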

Flexible Memory Networks

Authors: Curto Carina, Degeratu Anda, Itskov Vladimir
Publication date: 01/07/2011

Networks of neurons in some brain areas are flexible enough to encode new memories quickly. Using a standard firing rate model of recurrent networks, we develop a theory of flexible memory networks. Our main results characterize networks having the maximal number of flexible memory patterns, given a constraint graph on the network's connectivity matrix. Modulo a mild topological condition, we find a close connection between maximally flexible networks and rank 1 matrices. The topological condition is H_1(X;Z)=0, where X is the clique complex associated to the network's constraint graph; this condition is generically satisfied for large random networks that are not overly sparse. In order to prove our main results, we develop some matrix-theoretic tools and present them in a self-contained section independent of the neuroscience context. Comment: Accepted to Bulletin of Mathematical Biology, 11 July 201

arXiv.org e-Print Archive

DigitalCommons@University of Nebraska

Linear Memory Networks

Authors: Bacciu Davide, Carta Antonio, Sperduti Alessandro
Publication date: 08/11/2018

Recurrent neural networks can learn complex transduction problems that require maintaining and actively exploiting a memory of their inputs. Such models traditionally consider memory and input-output functionalities indissolubly entangled. We introduce a novel recurrent architecture based on the conceptual separation between the functional input-output transformation and the memory mechanism, showing how they can be implemented through different neural components. By building on such conceptualization, we introduce the Linear Memory Network, a recurrent model comprising a feedforward neural network, realizing the non-linear functional transformation, and a linear autoencoder for sequences, implementing the memory component. The resulting architecture can be efficiently trained by building on closed-form solutions to linear optimization problems. Further, by exploiting equivalence results between feedforward and recurrent neural networks we devise a pretraining schema for the proposed architecture. Experiments on polyphonic music datasets show competitive results against gated recurrent networks and other state-of-the-art models.

arXiv.org e-Print Archive

Archivio istituzionale della ricerca - Università di Padova
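
The Linear Memory Networks abstract above separates a non-linear feedforward component from a linear memory component. The sketch below shows one plausible reading of that separation as a single recurrent step. The abstract does not give the equations, so the update rules, the choice of `tanh`, and all names (`lmn_step`, `run_sequence`, the weight matrices `W_xh`, `W_mh`, `W_hm`, `W_mm`) are assumptions made for illustration, and no training or pretraining is shown.

```python
import math


def matvec(W, v):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def vec_add(a, b):
    """Element-wise sum of two equal-length vectors."""
    return [x + y for x, y in zip(a, b)]


def lmn_step(x, m_prev, W_xh, W_mh, W_hm, W_mm):
    """One step of the assumed architecture.

    Functional part (non-linear): h = tanh(W_xh x + W_mh m_prev)
    Memory part (purely linear):  m = W_hm h + W_mm m_prev
    """
    pre_activation = vec_add(matvec(W_xh, x), matvec(W_mh, m_prev))
    h = [math.tanh(z) for z in pre_activation]
    m = vec_add(matvec(W_hm, h), matvec(W_mm, m_prev))
    return h, m


def run_sequence(xs, m0, W_xh, W_mh, W_hm, W_mm):
    """Apply lmn_step over a sequence, returning hidden states and final memory."""
    m = m0
    hidden_states = []
    for x in xs:
        h, m = lmn_step(x, m, W_xh, W_mh, W_hm, W_mm)
        hidden_states.append(h)
    return hidden_states, m


# Hypothetical weights: 1 input unit, 1 hidden unit, 2 memory units.
W_xh = [[1.0]]
W_mh = [[0.0, 0.0]]
W_hm = [[0.0], [0.0]]
W_mm = [[0.5, 0.0], [0.0, 0.5]]

h, m = lmn_step([0.0], [1.0, 2.0], W_xh, W_mh, W_hm, W_mm)
print(h, m)
```

Because the memory update contains no non-linearity, it is the part that could be fitted with closed-form linear methods, which is the property the abstract highlights.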

Memory Augmented Control Networks

Authors: Atanasov Nikolay, Karydis Konstantinos, Khan Arbaaz, Kumar Vijay, Lee Daniel D., Zhang Clark
Publication date: 27/12/2017

Planning problems in partially observable environments cannot be solved directly with convolutional networks and require some form of memory. But even memory networks with sophisticated addressing schemes are unable to learn intelligent reasoning satisfactorily due to the complexity of simultaneously learning to access memory and plan. To mitigate these challenges we introduce the Memory Augmented Control Network (MACN). The proposed network architecture consists of three main parts. The first part uses convolutions to extract features and the second part uses a neural network-based planning module to pre-plan in the environment. The third part uses a network controller that learns to store those specific instances of past information that are necessary for planning. The performance of the network is evaluated in discrete grid world environments for path planning in the presence of simple and complex obstacles. We show that our network learns to plan and can generalize to new environments.

arXiv.org e-Print Archive

eScholarship - University of California