Inherent Weight Normalization in Stochastic Neural Networks

Author: Datta Suman
Detorakis Georgios
Dutta Sourav
Jerry Matthew
Khanna Abhishek
Neftci Emre
Publication venue: eScholarship, University of California
Publication date: 01/01/2019

Multiplicative stochasticity such as Dropout improves the robustness and generalizability of deep neural networks. Here, we further demonstrate that always-on multiplicative stochasticity and simple threshold neurons are sufficient operations for deep neural networks. We call such models Neural Sampling Machines (NSM). We find that the probability of activation of the NSM exhibits a self-normalizing property that mirrors Weight Normalization, a previously studied mechanism that fulfills many of the features of Batch Normalization in an online fashion. The normalization of activities during training speeds up convergence by preventing internal covariate shift caused by changes in the input distribution. The always-on stochasticity of the NSM confers the following advantages: the network is identical in the inference and learning phases, making the NSM suitable for online learning; it can exploit stochasticity inherent to a physical substrate such as analog non-volatile memories for in-memory computing; and it is suitable for Monte Carlo sampling, while requiring almost exclusively addition and comparison operations. We demonstrate NSMs on standard classification benchmarks (MNIST and CIFAR) and event-based classification benchmarks (N-MNIST and DVS Gestures). Our results show that NSMs perform comparably to or better than conventional artificial neural networks with the same architecture.
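
The self-normalizing behaviour described in this abstract can be illustrated with a toy example. The following is a minimal sketch, not the authors' implementation: it assumes Bernoulli multiplicative noise on each synapse, a zero threshold, and no bias, and the names `nsm_neuron`, `activation_probability`, and `keep_prob` are hypothetical. Under those assumptions the firing probability depends only on the direction of the weight vector, not on its scale.

```python
import random

def nsm_neuron(weights, inputs, keep_prob, rng):
    """One stochastic threshold neuron (hypothetical helper).

    Each synapse is multiplied by an always-on Bernoulli gate, the gated
    products are summed, and the neuron fires if the sum is positive.
    """
    pre_activation = 0.0
    for w, x in zip(weights, inputs):
        gate = 1.0 if rng.random() < keep_prob else 0.0
        pre_activation += gate * w * x
    return 1 if pre_activation > 0.0 else 0

def activation_probability(weights, inputs, keep_prob=0.5, samples=2000, seed=0):
    """Monte Carlo estimate of the probability that the neuron fires."""
    rng = random.Random(seed)
    fired = sum(nsm_neuron(weights, inputs, keep_prob, rng) for _ in range(samples))
    return fired / samples

weights = [0.5, -0.25, 1.0, -0.75]
inputs = [1.0, 1.0, 1.0, 1.0]

p_base = activation_probability(weights, inputs)
# Rescaling every weight by the same positive factor leaves the sign of the
# gated sum unchanged, so the estimated firing probability is identical.
p_scaled = activation_probability([4.0 * w for w in weights], inputs)
print(p_base, p_scaled)
```

Only additions and a comparison are needed once the gates are sampled, which matches the abstract's remark about the operations involved; how closely this toy neuron tracks Weight Normalization in a trained network is not something the sketch establishes.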

arXiv.org e-Print Archive

eScholarship - University of California

Online Learning of a Memory for Learning Rates

Author: Kappler Daniel
Meier Franziska
Schaal Stefan
Publication date: 01/01/2018

The promise of learning to learn for robotics rests on the hope that by extracting some information about the learning process itself we can speed up subsequent similar learning tasks. Here, we introduce a computationally efficient online meta-learning algorithm that builds and optimizes a memory model of the optimal learning rate landscape from previously observed gradient behaviors. While performing task-specific optimization, this memory of learning rates predicts how to scale currently observed gradients. After applying the gradient scaling, our meta-learner updates its internal memory based on the observed effect its prediction had. Our meta-learner can be combined with any gradient-based optimizer, learns on the fly, and can be transferred to new optimization tasks. In our evaluations we show that our meta-learning algorithm speeds up learning of MNIST classification and a variety of learning control tasks, in either batch or online learning settings. Comment: accepted to ICRA 2018, code available: https://github.com/fmeier/online-meta-learning ; video pitch available: https://youtu.be/9PzQ25FPPO

arXiv.org e-Print Archive

MPG.PuRe

Neural Network Models of Learning and Memory: Leading Questions and an Emerging Framework

Author: Carpenter Gail
Publication venue: Boston University Center for Adaptive Systems and Department of Cognitive and Neural Systems
Publication date: 01/10/2000

Office of Naval Research and the Defense Advanced Research Projects Agency (N00014-95-1-0409, N00014-1-95-0657); National Institutes of Health (NIH 20-316-4304-5

Boston University Institutional Repository (OpenBU)

Maximum Likelihood Associative Memories

Author: Gripon Vincent
Rabbat Michael
Publication date: 20/04/2013

Associative memories are structures that store data in such a way that it can later be retrieved given only a part of its content, a sort of error/erasure-resilience property. They are used in applications ranging from caches and memory management in CPUs to database engines. In this work we study associative memories built on the maximum likelihood principle. First, we derive minimum residual error rates when the data stored comes from a uniform binary source. Second, we determine the minimum amount of memory required to store the same data. Finally, we bound the computational complexity for message retrieval. We then compare these bounds with two existing associative memory architectures: the celebrated Hopfield neural networks and a neural network architecture introduced more recently by Gripon and Berrou.
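
To make retrieval from partial content concrete, here is a minimal brute-force sketch. It is an illustration under stated assumptions rather than the construction analysed in the paper: messages are binary tuples, erased positions in the probe are marked with `None`, and the function name `retrieve` is hypothetical. With a uniform source and pure erasures, every stored message that agrees with the known bits is an equally plausible answer, so the sketch simply returns all of them.

```python
def retrieve(stored, probe):
    """Return every stored message that matches the probe on its known bits.

    `probe` uses None for erased positions. More than one returned candidate
    means the erasures cannot be resolved without a chance of error.
    """
    return [
        message
        for message in stored
        if all(p is None or p == bit for p, bit in zip(probe, message))
    ]

stored = [(0, 1, 1, 0), (1, 1, 0, 0), (0, 0, 1, 1)]

unique = retrieve(stored, (None, 1, 1, None))     # one consistent message
ambiguous = retrieve(stored, (None, 1, None, 0))  # two consistent messages
print(unique, ambiguous)
```

This exhaustive scan touches every stored bit, so its cost grows with the number of messages times their length; it is a baseline for intuition only, not one of the architectures the abstract compares.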

arXiv.org e-Print Archive

Crossref

HAL-Université de Bretagne Occidentale

HAL Descartes

Hal-Diderot

Dreaming neural networks: forgetting spurious memories and reinforcing pure ones

Author: Agliari Elena
Barra Adriano
Fachechi Alberto
Publication date: 29/10/2018

The standard Hopfield model for associative neural networks accounts for biological Hebbian learning and acts as the harmonic oscillator for pattern recognition, however its maximal storage capacity is $\alpha \sim 0.14$, far from the theoretical bound for symmetric networks, i.e. $\alpha = 1$. Inspired by sleeping and dreaming mechanisms in mammal brains, we propose an extension of this model displaying the standard on-line (awake) learning mechanism (that allows the storage of external information in terms of patterns) and an off-line (sleep) unlearning & consolidating mechanism (that allows spurious-pattern removal and pure-pattern reinforcement): this obtained daily prescription is able to saturate the theoretical bound $\alpha = 1$, remaining also extremely robust against thermal noise. Both neural and synaptic features are analyzed both analytically and numerically. In particular, beyond obtaining a phase diagram for neural dynamics, we focus on synaptic plasticity and we give explicit prescriptions on the temporal evolution of the synaptic matrix. We analytically prove that our algorithm makes the Hebbian kernel converge with high probability to the projection matrix built over the pure stored patterns. Furthermore, we obtain a sharp and explicit estimate for the "sleep rate" in order to ensure such a convergence. Finally, we run extensive numerical simulations (mainly Monte Carlo sampling) to check the approximations underlying the analytical investigations (e.g., we developed the whole theory at the so called replica-symmetric level, as standard in the Amit-Gutfreund-Sompolinsky reference framework) and possible finite-size effects, finding overall full agreement with the theory. Comment: 31 pages, 12 figures
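
For readers unfamiliar with the baseline this abstract extends, the following is a minimal sketch of standard Hebbian storage and zero-temperature recall in a Hopfield network. It does not implement the sleep (unlearning and consolidating) mechanism proposed in the paper. The helper names `hebbian_kernel` and `recall` are hypothetical, the diagonal is set to zero by assumption, and the two patterns are chosen orthogonal so the toy run is deterministic and says nothing about storage capacity.

```python
def hebbian_kernel(patterns):
    """Hebbian couplings J_ij = (1/N) * sum over patterns of xi_i * xi_j, with J_ii = 0."""
    n = len(patterns[0])
    return [
        [0.0 if i == j else sum(p[i] * p[j] for p in patterns) / n for j in range(n)]
        for i in range(n)
    ]

def recall(kernel, state, sweeps=5):
    """Synchronous sign updates until the state stops changing or sweeps run out."""
    state = list(state)
    for _ in range(sweeps):
        fields = [sum(row[j] * state[j] for j in range(len(state))) for row in kernel]
        new_state = [1 if h >= 0 else -1 for h in fields]
        if new_state == state:
            break
        state = new_state
    return state

pattern_a = [1, 1, 1, 1, -1, -1, -1, -1]
pattern_b = [1, -1, 1, -1, 1, -1, 1, -1]
kernel = hebbian_kernel([pattern_a, pattern_b])

# Flip the first spin of pattern_a and let the network clean it up.
corrupted = [-1] + pattern_a[1:]
restored = recall(kernel, corrupted)
print(restored == pattern_a)
```

With many random patterns the Hebbian kernel also creates spurious attractors, which is the limitation the abstract's off-line mechanism is designed to address.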

arXiv.org e-Print Archive

Archivio della ricerca- Università di Roma La Sapienza

Archivio Istituzionale della Ricerca- Università del Salento

An analog feedback associative memory

Author: Abu-Mostafa Yaser S.
Atiya Amir
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/01/1993

A method for the storage of analog vectors, i.e., vectors whose components are real-valued, is developed for the Hopfield continuous-time network. An important requirement is that each memory vector has to be an asymptotically stable (i.e., attractive) equilibrium of the network. Some of the limitations imposed by the continuous Hopfield model on the set of vectors that can be stored are pointed out. These limitations can be relieved by choosing a network containing visible as well as hidden units. An architecture consisting of several hidden layers and a visible layer, connected in a circular fashion, is considered. It is proved that the two-layer case is guaranteed to store any number of given analog vectors provided their number does not exceed one plus the number of neurons in the hidden layer. A learning algorithm that correctly adjusts the locations of the equilibria and guarantees their asymptotic stability is developed. Simulation results confirm the effectiveness of the approach.

CiteSeerX

Caltech Authors

Second Order Neural Networks.

Author: Das Sanjoy
Publication venue: LSU Digital Commons
Publication date: 01/01/1994

In this dissertation, a feedback neural network model is proposed. This network uses a second order method of convergence based on the Newton-Raphson method, and it has both discrete and continuous versions. When used as an associative memory, the proposed model is called the polynomial neural network (PNN). The memories of this network can be located anywhere in an n-dimensional space rather than being confined to the corners of the latter. A method for storing memories is proposed. This is a single-step method, unlike the currently known computationally intensive iterative methods. An energy function for the polynomial neural network is suggested. Issues relating to the error-correcting ability of this network are addressed. Additionally, it has been found that the attractor basins of the memories of this network reveal a curious fractal topology, thereby suggesting a highly complex and often unpredictable nature. The use of the second order neural network as a function optimizer is also shown. While issues relating to the hardware realization of this network are addressed only briefly, it is indicated that such a network would require a large amount of hardware for its realization. This problem can be obviated by using a simplified model that is also described. The performance of this simplified model is comparable to that of the basic model while requiring much less hardware for its realization.
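
The Newton-Raphson iteration underlying this model can be sketched on a single complex state variable. This is an illustrative assumption, not the dissertation's network: the polynomial z^3 - 1 is chosen here only because its three roots play the role of stored "memories", and the names `newton_step`, `converge`, and `nearest_root` are hypothetical.

```python
def newton_step(z):
    """One Newton-Raphson update for f(z) = z**3 - 1 (z must be nonzero)."""
    return z - (z**3 - 1) / (3 * z**2)

def converge(z, steps=50):
    """Feed the state back through the update a fixed number of times."""
    for _ in range(steps):
        z = newton_step(z)
    return z

# The three cube roots of unity act as the attractors of the iteration.
ROOTS = [complex(1, 0), complex(-0.5, 3**0.5 / 2), complex(-0.5, -(3**0.5) / 2)]

def nearest_root(z):
    """Index of the root closest to z."""
    return min(range(len(ROOTS)), key=lambda k: abs(z - ROOTS[k]))

from_right = converge(complex(2, 0))
from_upper_left = converge(complex(-1, 1))
print(nearest_root(from_right), nearest_root(from_upper_left))
```

For this particular map the boundaries between the three basins of attraction are a well-known fractal, which gives some intuition for why basins of a Newton-based associative memory can be hard to predict.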

Louisiana State University

Versatile stochastic dot product circuits based on nonvolatile memories for high performance neurocomputing and neurooptimization.

Author: Mahmoodi MR
Prezioso M
Strukov DB
Publication venue: eScholarship, University of California
Publication date: 01/11/2019

The key operation in stochastic neural networks, which have become the state-of-the-art approach for solving problems in machine learning, information theory, and statistics, is a stochastic dot-product. While there have been many demonstrations of dot-product circuits and, separately, of stochastic neurons, an efficient hardware implementation combining both functionalities is still missing. Here we report compact, fast, energy-efficient, and scalable stochastic dot-product circuits based on either passively integrated metal-oxide memristors or embedded floating-gate memories. The circuit's high performance is due to its mixed-signal implementation, while the efficient stochastic operation is achieved by utilizing the circuit's noise, intrinsic and/or extrinsic to the memory cell array. The dynamic scaling of weights, enabled by analog memory devices, allows for efficient realization of different annealing approaches to improve functionality. The proposed approach is experimentally verified for two representative applications, namely by implementing a neural network for solving a four-node graph-partitioning problem, and a Boltzmann machine with 10 input and 8 hidden neurons.
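
A software analogue of a stochastic dot-product may help clarify the role of weight scaling in annealing. The sketch below is an assumption-laden illustration, not a model of the reported circuits: it uses the logistic firing rule common in Boltzmann machines in place of physical circuit noise, and the names `fire_probability`, `stochastic_neuron`, and `temperature` are hypothetical. Dividing the dot product by a temperature is equivalent to scaling all weights by its inverse.

```python
import math
import random

def fire_probability(weights, inputs, temperature):
    """Logistic probability of firing given the dot product and a temperature."""
    dot = sum(w * x for w, x in zip(weights, inputs))
    return 1.0 / (1.0 + math.exp(-dot / temperature))

def stochastic_neuron(weights, inputs, temperature, rng):
    """Sample a binary output from the firing probability."""
    return 1 if rng.random() < fire_probability(weights, inputs, temperature) else 0

weights = [0.75, -0.25]
inputs = [1.0, 1.0]

# Lowering the temperature (scaling the weights up) sharpens the decision.
p_hot = fire_probability(weights, inputs, temperature=2.0)
p_cold = fire_probability(weights, inputs, temperature=0.5)

rng = random.Random(0)
samples = [stochastic_neuron(weights, inputs, 0.5, rng) for _ in range(1000)]
print(p_hot, p_cold, sum(samples) / len(samples))
```

An annealing schedule would lower `temperature` gradually over the course of a run, moving the neuron from nearly random to nearly deterministic behaviour.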

eScholarship - University of California