In-memory Realization of In-situ Few-shot Continual Learning with a Dynamically Evolving Explicit Memory
Continually learning new classes from a few training examples without
forgetting previously learned classes demands a flexible architecture with an
inevitably growing portion of storage, in which new examples and classes can be
incrementally stored and efficiently retrieved. One viable architectural
solution is to tightly couple a stationary deep neural network to a dynamically
evolving explicit memory (EM). As the centerpiece of this architecture, we
propose an EM unit that leverages energy-efficient in-memory compute (IMC)
cores during the course of continual learning operations. We demonstrate for
the first time how the EM unit can physically superpose multiple training
examples, expand to accommodate unseen classes, and perform similarity search
during inference, using operations on an IMC core based on phase-change memory
(PCM). Specifically, the physical superposition of a few encoded training
examples is realized via in-situ progressive crystallization of PCM devices.
The classification accuracy achieved on the IMC core remains within
1.28%--2.5% of that of the state-of-the-art full-precision baseline software
model on both the CIFAR-100 and miniImageNet datasets when continually
learning 40 novel classes (from only five examples per class) on top of 60 old
classes.
Comment: Accepted at the European Solid-state Devices and Circuits Conference
(ESSDERC), September 202
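The explicit-memory mechanics described above can be illustrated with a minimal software sketch: a few encoded examples of a class are superposed into one prototype vector (the software counterpart of in-situ progressive crystallization of PCM devices), unseen classes expand the memory, and inference is a similarity search over prototypes. All names, dimensions, and the random "encodings" below are illustrative assumptions; nothing here models PCM device physics or the paper's actual encoder.

import numpy as np

class ExplicitMemory:
    """Software analog of the dynamically evolving explicit memory (EM).

    Each class is represented by one prototype vector. Superposing a new
    encoded example onto a prototype mimics, in software, the in-situ
    accumulation the paper realizes via progressive PCM crystallization.
    """

    def __init__(self, dim: int):
        self.dim = dim
        self.prototypes: dict[int, np.ndarray] = {}  # class label -> prototype

    def superpose(self, label: int, encoded_example: np.ndarray) -> None:
        """Add one encoded training example; unseen labels expand the memory."""
        if label not in self.prototypes:
            self.prototypes[label] = np.zeros(self.dim)
        self.prototypes[label] += encoded_example  # accumulate, never rewrite

    def similarity_search(self, query: np.ndarray) -> int:
        """Return the label whose prototype is most cosine-similar to the query."""
        def cosine(p: np.ndarray) -> float:
            return p @ query / (np.linalg.norm(p) * np.linalg.norm(query) + 1e-12)
        return max(self.prototypes, key=lambda lbl: cosine(self.prototypes[lbl]))

# Usage: continually learn two novel classes from five examples each
# (labels 60 and 61 echo the paper's "on top of 60 old classes" setting).
rng = np.random.default_rng(0)
em = ExplicitMemory(dim=512)
for label in (60, 61):
    for _ in range(5):  # 5-shot learning with stand-in random encodings
        em.superpose(label, rng.standard_normal(512))
print(em.similarity_search(rng.standard_normal(512)))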
Improving deep learning through loss-function evolution
As the complexity of neural network models has grown, it has become increasingly important to optimize their design automatically through metalearning. Methods for discovering hyperparameters, topologies, and learning rate schedules have led to significant increases in performance. This dissertation tackles a new type of metalearning: loss-function optimization. Loss functions define a model's core training objective and thus present a clear opportunity. Two techniques, GLO and TaylorGLO, were developed to tackle this metalearning problem using genetic programming and evolution strategies. Experiments show that neural networks trained with metalearned loss functions are more accurate, have higher data utilization, train faster, and are more robust against adversarial attacks. A theoretical framework was developed to analyze how and why different loss functions bias training towards different regions of the parameter space. Using this framework, their performance gains are found to result from a regularizing effect that is tailored to each domain. Overall, this dissertation demonstrates that new, metalearned loss functions can result in better trained models, and provides the next stepping stone towards fully automated machine learning.
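As a rough illustration of loss-function metalearning, the sketch below parameterizes a per-sample loss as a low-order polynomial in the prediction and target (in the spirit of TaylorGLO) and tunes its coefficients with a simple (1+4) evolution strategy, using a tiny logistic model on synthetic data as the inner training task. The coefficient layout, the toy task, and the finite-difference inner loop are all assumptions made for brevity; the dissertation's actual methods use genetic programming (GLO) and CMA-ES (TaylorGLO) on real networks.

import numpy as np

rng = np.random.default_rng(0)

# Synthetic binary classification task (all data illustrative).
X = rng.standard_normal((200, 2))
y = (X[:, 0] + X[:, 1] > 0).astype(float)
X_tr, y_tr, X_va, y_va = X[:150], y[:150], X[150:], y[150:]

def sigmoid(z):
    return 1 / (1 + np.exp(-np.clip(z, -30, 30)))

def taylor_loss(theta, p, t):
    """Per-sample loss as a low-order polynomial in prediction p and target t,
    in the spirit of TaylorGLO (this coefficient layout is illustrative)."""
    terms = np.stack([p, t, p * t, p**2, t**2, p**3])
    return float(np.mean(theta @ terms))

def fitness(theta, steps=200, lr=0.5, eps=1e-4):
    """Inner loop: train a logistic model under the candidate loss using
    finite-difference gradients (keeps the sketch dependency-free), then
    return validation accuracy as the outer, metalearning objective."""
    w = np.zeros(2)
    for _ in range(steps):
        base = taylor_loss(theta, sigmoid(X_tr @ w), y_tr)
        grad = np.zeros(2)
        for i in range(2):
            w2 = w.copy()
            w2[i] += eps
            grad[i] = (taylor_loss(theta, sigmoid(X_tr @ w2), y_tr) - base) / eps
        w -= lr * grad
    return np.mean((sigmoid(X_va @ w) > 0.5) == y_va)

# Outer loop: a simple (1+4) evolution strategy over the loss coefficients.
theta = 0.1 * rng.standard_normal(6)
best = fitness(theta)
for _ in range(20):
    for child in theta + 0.1 * rng.standard_normal((4, 6)):
        f = fitness(child)
        if f > best:
            theta, best = child, f
print(f"evolved-loss validation accuracy: {best:.2f}")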
Biological learning in key-value memory networks
In neuroscience, classical Hopfield networks are the standard biologically plausible model of long-term memory, relying on Hebbian plasticity for storage and attractor dynamics for recall. In contrast, memory-augmented neural networks in machine learning commonly use a key-value mechanism to store and read out memories in a single step. Such augmented networks achieve impressive feats of memory compared to traditional variants, yet it remains unclear whether they can be implemented by biological systems. In our work, we bridge this gap by proposing a set of biologically plausible three-factor plasticity rules for a basic feedforward key-value memory network. Keys are stored in the input-to-hidden synaptic weights by a "non-Hebbian" rule, controlled only by pre-synaptic activity, and modulated by local third factors which represent dendritic spikes. Values are stored in the hidden-to-output weights by a Hebbian rule, with the pre-synaptic neuron selected through softmax attention which represents recurrent inhibition. The same rules are recovered when network parameters are meta-learned. Our network performs on par with classical Hopfield networks on autoassociative memory tasks and can be naturally extended to correlated inputs, continual recall, heteroassociative memory, and sequence learning. Importantly, since memories are stored in slots indexed by hidden layer neurons, unlike the fully distributed representation in the classical Hopfield network, they can be individually selected for extended storage or rapid decay. Finally, our memory network can easily be incorporated into a larger neural system, either as a memory bank for an external controller, or as a fast learning system used in conjunction with a slow one. Overall, our results suggest a compelling alternative to the classical Hopfield network as a model of biological long-term memory.
Keywords: learning, memory, synaptic plasticity, Hebbian, key-value memory, neural network, three-factor plasticity
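The write and read rules described in the abstract can be sketched in a few lines: keys land in the input-to-hidden weights via a purely presynaptic (non-Hebbian) update gated per slot, values land in the hidden-to-output weights via a Hebbian outer-product update, and recall is one-step softmax attention. The round-robin slot pointer below stands in for the local third factor (dendritic spike) that selects a slot, and all dimensions and names are illustrative; this is a sketch of the general mechanism under those assumptions, not the authors' implementation.

import numpy as np

def softmax(z, beta=8.0):
    """Sharp softmax; beta plays the role of recurrent inhibition strength."""
    e = np.exp(beta * (z - z.max()))
    return e / e.sum()

class KeyValueMemory:
    """Minimal feedforward key-value memory with one hidden neuron per slot.

    Writing uses a non-Hebbian rule for keys (the gated slot's input weights
    are set from presynaptic activity alone) and a Hebbian rule for values
    (hidden-to-output weights change in proportion to pre- and post-synaptic
    activity). Reading is one-step softmax attention over stored keys.
    """

    def __init__(self, n_slots, key_dim, value_dim):
        self.W_k = np.zeros((n_slots, key_dim))    # input -> hidden (keys)
        self.W_v = np.zeros((value_dim, n_slots))  # hidden -> output (values)
        self.next_slot = 0

    def write(self, key, value):
        s = self.next_slot               # stand-in for third-factor gating
        self.W_k[s] = key                # non-Hebbian: presynaptic term only
        h = np.zeros(self.W_k.shape[0])
        h[s] = 1.0
        self.W_v += np.outer(value, h)   # Hebbian: post x pre outer product
        self.next_slot = (s + 1) % self.W_k.shape[0]

    def read(self, key):
        h = softmax(self.W_k @ key)      # attention over stored keys
        return self.W_v @ h

# Usage: heteroassociative recall from a noisy cue (dimensions illustrative).
rng = np.random.default_rng(1)
mem = KeyValueMemory(n_slots=16, key_dim=32, value_dim=8)
keys = rng.standard_normal((5, 32))
vals = rng.standard_normal((5, 8))
for k, v in zip(keys, vals):
    mem.write(k, v)
noisy_cue = keys[2] + 0.3 * rng.standard_normal(32)
print(np.allclose(mem.read(noisy_cue), vals[2], atol=0.5))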