
5,638 research outputs found

Constraining Implicit Space with Minimum Description Length: An Unsupervised Attention Mechanism across Neural Network Layers

Author: Lin Baihan
Publication date: 10/09/2020

Inspired by the adaptation phenomenon of neuronal firing, we propose the regularity normalization (RN) as an unsupervised attention mechanism (UAM) which computes the statistical regularity in the implicit space of neural networks under the Minimum Description Length (MDL) principle. Treating the neural network optimization process as a partially observable model selection problem, UAM constrains the implicit space by a normalization factor, the universal code length. We compute this universal code incrementally across neural network layers and demonstrate the flexibility to include data priors such as top-down attention and other oracle information. Empirically, our approach outperforms existing normalization methods in tackling limited, imbalanced and non-stationary input distribution in image classification, classic control, procedurally-generated reinforcement learning, generative modeling, handwriting generation and question answering tasks with various neural network architectures. Lastly, UAM tracks dependency and critical learning stages across layers and recurrent time steps of deep networks.

arXiv.org e-Print Archive

Multidisciplinary Digital Publishing Institute

Directory of Open Access Journals

PubMed Central

Lifelong Learning of Spatiotemporal Representations with Dual-Memory Recurrent Self-Organization

Author: Parisi German I.
Tani Jun
Weber Cornelius
Wermter Stefan
Publication venue: 'Frontiers Media SA'
Publication date: 01/01/2018

Artificial autonomous agents and robots interacting in complex environments are required to continually acquire and fine-tune knowledge over sustained periods of time. The ability to learn from continuous streams of information is referred to as lifelong learning and represents a long-standing challenge for neural network models due to catastrophic forgetting. Computational models of lifelong learning typically alleviate catastrophic forgetting in experimental scenarios with given datasets of static images and limited complexity, thereby differing significantly from the conditions artificial agents are exposed to. In more natural settings, sequential information may become progressively available over time and access to previous experience may be restricted. In this paper, we propose a dual-memory self-organizing architecture for lifelong learning scenarios. The architecture comprises two growing recurrent networks with the complementary tasks of learning object instances (episodic memory) and categories (semantic memory). Both growing networks can expand in response to novel sensory experience: the episodic memory learns fine-grained spatiotemporal representations of object instances in an unsupervised fashion while the semantic memory uses task-relevant signals to regulate structural plasticity levels and develop more compact representations from episodic experience. For the consolidation of knowledge in the absence of external sensory input, the episodic memory periodically replays trajectories of neural reactivations. We evaluate the proposed model on the CORe50 benchmark dataset for continuous object recognition, showing that we significantly outperform current methods of lifelong learning in three different incremental learning scenarios.

arXiv.org e-Print Archive

OIST Institutional Repository

Directory of Open Access Journals

Frontiers - Publisher Connector

Sliced Cramér synaptic consolidation for preserving deeply learned representations

Author: Andrea Soltoggio
Nicholas A Ketz
Praveen K Pilly
Soheil Kolouri
Publication date: 14/03/2020

Deep neural networks suffer from the inability to preserve the learned data representation (i.e., catastrophic forgetting) in domains where the input data distribution is non-stationary and changes during training. Various selective synaptic plasticity approaches have recently been proposed to preserve network parameters that are crucial for previously learned tasks while learning new tasks. We explore such selective synaptic plasticity approaches through a unifying lens of memory replay and show the close relationship between methods like Elastic Weight Consolidation (EWC) and Memory-Aware-Synapses (MAS). We then propose a fundamentally different class of preservation methods that aim at preserving the distribution of the network’s output at an arbitrary layer for previous tasks while learning a new one. We propose the sliced Cramér distance as a suitable choice for such preservation and evaluate our Sliced Cramér Preservation (SCP) algorithm through extensive empirical investigations on various network architectures in both supervised and unsupervised learning settings. We show that SCP consistently utilizes the learning capacity of the network better than online-EWC and MAS methods on various incremental learning tasks.

Loughborough University Institutional Repository

An Unsupervised Neural Network for Real-Time Low-Level Control of a Mobile Robot: Noise Resistance, Stability, and Hardware Implementation

Author: Coronado Juan Lopez
Gaudiano Paolo
Zalama Eduardo
Publication venue: Boston University Center for Adaptive Systems and Department of Cognitive and Neural Systems
Publication date: 01/07/1994

We have recently introduced a neural network mobile robot controller (NETMORC). The controller is based on earlier neural network models of biological sensory-motor control. We have shown that NETMORC is able to guide a differential drive mobile robot to an arbitrary stationary or moving target while compensating for noise and other forms of disturbance, such as wheel slippage or changes in the robot's plant. Furthermore, NETMORC is able to adapt in response to long-term changes in the robot's plant, such as a change in the radius of the wheels. In this article we first review the NETMORC architecture, and then we prove that NETMORC is asymptotically stable. After presenting a series of simulation results showing robustness to disturbances, we compare NETMORC performance on a trajectory-following task with the performance of an alternative controller. Finally, we describe preliminary results on the hardware implementation of NETMORC with the mobile robot ROBUTER. Sloan Fellowship (BR-3122), Air Force Office of Scientific Research (F49620-92-J-0499)

Boston University Institutional Repository (OpenBU)

Learning Deep Belief Networks from Non-Stationary Streams

Author: Calandra R
Deisenroth MP
Pouzols FM
Raiko T
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2012

Deep learning has proven to be beneficial for complex tasks such as classifying images. However, this approach has been mostly applied to static datasets. The analysis of non-stationary (e.g., concept drift) streams of data involves specific issues connected with the temporal and changing nature of the data. In this paper, we propose a proof-of-concept method, called Adaptive Deep Belief Networks, that shows how deep learning can be generalized to learn online from changing streams of data. We do so by exploiting the generative properties of the model to incrementally re-train the Deep Belief Network whenever new data are collected. This approach eliminates the need to store past observations and, therefore, requires only constant memory consumption. Hence, our approach can be valuable for life-long learning from non-stationary data streams. © 2012 Springer-Verlag

TUbiblio

Crossref

Spiral - Imperial College Digital Repository

A Real-Time Unsupervised Neural Network for the Low-Level Control of a Mobile Robot in a Nonstationary Environment

Author: Coronado Juan Lopez
Gaudiano Paolo
Zalama Eduardo
Publication venue: Boston University Center for Adaptive Systems and Department of Cognitive and Neural Systems
Publication date: 01/01/1994

This article introduces a real-time, unsupervised neural network that learns to control a two-degree-of-freedom mobile robot in a nonstationary environment. The neural controller, which is termed neural NETwork MObile Robot Controller (NETMORC), combines associative learning and Vector Associative Map (VAM) learning to generate transformations between spatial and velocity coordinates. As a result, the controller learns the wheel velocities required to reach a target at an arbitrary distance and angle. The transformations are learned during an unsupervised training phase, during which the robot moves as a result of randomly selected wheel velocities. The robot learns the relationship between these velocities and the resulting incremental movements. Aside from being able to reach stationary or moving targets, the NETMORC structure also enables the robot to perform successfully in spite of disturbances in the environment, such as wheel slippage, or changes in the robot's plant, including changes in wheel radius, changes in inter-wheel distance, or changes in the internal time step of the system. Finally, the controller is extended to include a module that learns an internal odometric transformation, allowing the robot to reach targets when visual input is sporadic or unreliable. Sloan Fellowship (BR-3122), Air Force Office of Scientific Research (F49620-92-J-0499)

Boston University Institutional Repository (OpenBU)

Adaptive Resonance Theory

Author: Carpenter Gail A.
Grossberg Stephen
Publication venue: Boston University Center for Adaptive Systems and Department of Cognitive and Neural Systems
Publication date: 01/05/2009

SyNAPSE program of the Defense Advanced Research Projects Agency (Hewlett-Packard Company, subcontract under DARPA prime contract HR0011-09-3-0001, and HRL Laboratories LLC, subcontract #801881-BS under DARPA prime contract HR0011-09-C-0001); CELEST, an NSF Science of Learning Center (SBE-0354378)

Boston University Institutional Repository (OpenBU)

Measuring Catastrophic Forgetting in Neural Networks

Author: Abitino Angelina
Hayes Tyler
Kanan Christopher
Kemker Ronald
McClure Marc
Publication date: 09/11/2017

Deep neural networks are used in many state-of-the-art systems for machine perception. Once a network is trained to do a specific task, e.g., bird classification, it cannot easily be trained to do new tasks, e.g., incrementally learning to recognize additional bird species or learning an entirely different task such as flower recognition. When new tasks are added, typical deep neural networks are prone to catastrophically forgetting previous tasks. Networks that are capable of assimilating new information incrementally, much like how humans form new memories over time, will be more efficient than re-training the model from scratch each time a new task needs to be learned. There have been multiple attempts to develop schemes that mitigate catastrophic forgetting, but these methods have not been directly compared, the tests used to evaluate them vary considerably, and these methods have only been evaluated on small-scale problems (e.g., MNIST). In this paper, we introduce new metrics and benchmarks for directly comparing five different mechanisms designed to mitigate catastrophic forgetting in neural networks: regularization, ensembling, rehearsal, dual-memory, and sparse-coding. Our experiments on real-world images and sounds show that the mechanism(s) that are critical for optimal performance vary based on the incremental training paradigm and type of data being used, but they all demonstrate that the catastrophic forgetting problem has yet to be solved. Comment: To appear in AAAI 201

arXiv.org e-Print Archive

Association for the Advancement of Artificial Intelligence: AAAI Publications