UDC: Unified DNAS for Compressible TinyML Models
Deploying TinyML models on low-cost IoT hardware is very challenging due to
limited device memory capacity. Neural processing unit (NPU) hardware addresses
the memory challenge by using model compression, exploiting weight quantization
and sparsity to fit more parameters in the same footprint. However, designing
compressible neural networks (NNs) is challenging, as it expands the design
space across which we must make balanced trade-offs. This paper demonstrates
Unified DNAS for Compressible (UDC) NNs, which explores a large search space to
generate state-of-the-art compressible NNs for NPUs. ImageNet results show UDC
networks are up to smaller (iso-accuracy) or 6.25% more accurate
(iso-model size) than previous work.
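As a rough illustration of why compressibility matters on NPU-class hardware, the sketch below (not the UDC method itself; the layer size, sparsity level, and storage format are arbitrary assumptions) prunes and quantizes a single weight matrix and compares the resulting footprint against the dense fp32 baseline.

# Illustrative sketch, not the UDC algorithm: how weight sparsity plus
# int8 quantization shrink the on-device footprint of a single layer.
import numpy as np

rng = np.random.default_rng(0)
w = rng.normal(size=(256, 256)).astype(np.float32)   # dense fp32 weights

# Magnitude pruning to 75% sparsity: keep the largest-magnitude 25% of weights.
k = int(0.25 * w.size)
threshold = np.sort(np.abs(w), axis=None)[-k]
w_sparse = np.where(np.abs(w) >= threshold, w, 0.0)

# Symmetric int8 quantization of the surviving weights.
scale = np.abs(w_sparse).max() / 127.0
w_q = np.clip(np.round(w_sparse / scale), -127, 127).astype(np.int8)

dense_bytes = w.size * 4                              # fp32, uncompressed
nonzero = np.count_nonzero(w_q)
# Assumed storage format: one int8 per non-zero value plus a 1-bit occupancy mask.
compressed_bytes = nonzero * 1 + w.size // 8
print(f"dense: {dense_bytes} B, compressed: {compressed_bytes} B, "
      f"ratio: {dense_bytes / compressed_bytes:.1f}x")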
Optimal modularity and memory capacity of neural reservoirs
The neural network is a powerful computing framework that has been exploited
by biological evolution and by humans for solving diverse problems. Although
the computational capabilities of neural networks are determined by their
structure, our current understanding of the relationship between a neural
network's architecture and its function is still primitive. Here we reveal that
a neural network's modular architecture plays a vital role in determining the
neural dynamics and memory performance of a network of threshold neurons. In
particular, we demonstrate that there exists an optimal modularity for memory
performance, at which a balance between local cohesion and global connectivity
is established, allowing optimally modular networks to remember longer. Our
results suggest that insights from dynamical analysis of neural networks and
information-spreading processes can be leveraged to better design neural
networks, and they may shed light on the brain's modular organization.
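The abstract does not give the model details, so the following sketch is an assumed parameterization for illustration only: a modular network of binary threshold neurons in which a mixing parameter mu trades intra-module cohesion against inter-module connectivity, with synchronous dynamics run from a random initial pattern.

# Illustrative sketch (assumed model, not the paper's exact construction).
import numpy as np

def modular_threshold_net(n_modules=4, size=50, mu=0.2, p=0.2, seed=0):
    """Random weighted network; a fraction `mu` of expected links cross
    module boundaries, the rest stay within a neuron's own module."""
    rng = np.random.default_rng(seed)
    n = n_modules * size
    module = np.repeat(np.arange(n_modules), size)
    same = module[:, None] == module[None, :]
    p_in = p * (1 - mu) * n_modules              # rescaled so mean degree ~ p*n
    p_out = p * mu * n_modules / (n_modules - 1)
    prob = np.where(same, p_in, p_out)
    adj = rng.random((n, n)) < prob
    return adj * rng.normal(size=(n, n)), module

def run_dynamics(w, x0, theta=0.0, steps=50):
    """Synchronous update of binary threshold neurons: x <- 1[W x > theta]."""
    x = x0.copy()
    history = [x.copy()]
    for _ in range(steps):
        x = (w @ x > theta).astype(float)
        history.append(x.copy())
    return np.array(history)

w, module = modular_threshold_net(mu=0.2)
x0 = (np.random.default_rng(1).random(w.shape[0]) < 0.5).astype(float)
traj = run_dynamics(w, x0)
print("activity per step:", traj.sum(axis=1)[:10])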
End-to-end Incremental Learning
Although deep learning approaches have stood out in recent years due to their state-of-the-art results, they continue to suffer from catastrophic forgetting, a dramatic decrease in overall performance when training with new classes added incrementally. This is due to current neural network architectures requiring the entire dataset, consisting of all the samples from the old as well as the new classes, to update the model---a requirement that becomes easily unsustainable as the number of classes grows. We address this issue with our approach to learning deep neural networks incrementally, using new data and only a small exemplar set corresponding to samples from the old classes. This is based on a loss composed of a distillation measure to retain the knowledge acquired from the old classes, and a cross-entropy loss to learn the new classes. Our incremental training is achieved while keeping the entire framework end-to-end, i.e., learning the data representation and the classifier jointly, unlike recent methods with no such guarantees. This work has been funded by project TIC-1692 (Junta de Andalucía), TIN2016-80920R (Spanish Ministry of Science and Technology) and Universidad de Málaga, Campus de Excelencia Internacional Andalucía Tech.
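A minimal sketch of the kind of combined objective described above: a cross-entropy term on the ground-truth labels plus a distillation term that keeps the updated model's old-class outputs close to those of the frozen old model. The function name, the temperature, and the use of a KL-based distillation term are assumptions for illustration; the paper's exact formulation may differ.

# Combined incremental-learning objective: classification + distillation (sketch).
import torch
import torch.nn.functional as F

def incremental_loss(new_logits, old_logits, labels, n_old_classes, T=2.0, lam=1.0):
    # Standard classification loss over all (old + new) classes.
    ce = F.cross_entropy(new_logits, labels)
    # Distillation: match softened probabilities on the old-class outputs only.
    p_old = F.softmax(old_logits[:, :n_old_classes] / T, dim=1)
    log_p_new = F.log_softmax(new_logits[:, :n_old_classes] / T, dim=1)
    distill = F.kl_div(log_p_new, p_old, reduction="batchmean") * (T * T)
    return ce + lam * distill

# Example with random tensors standing in for model outputs.
new_logits = torch.randn(8, 12)          # 10 old + 2 new classes
old_logits = torch.randn(8, 10)          # frozen old model covers old classes only
labels = torch.randint(0, 12, (8,))
print(incremental_loss(new_logits, old_logits, labels, n_old_classes=10))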
State-Regularized Recurrent Neural Networks to Extract Automata and Explain Predictions
Recurrent neural networks are a widely used class of neural architectures.
They have, however, two shortcomings. First, they are often treated as
black-box models and as such it is difficult to understand what exactly they
learn as well as how they arrive at a particular prediction. Second, they tend
to work poorly on sequences requiring long-term memorization, despite having
this capacity in principle. We aim to address both shortcomings with a class of
recurrent networks that use a stochastic state transition mechanism between
cell applications. This mechanism, which we term state-regularization, makes
RNNs transition between a finite set of learnable states. We evaluate
state-regularized RNNs on (1) regular languages for the purpose of automata
extraction; (2) non-regular languages such as balanced parentheses and
palindromes where external memory is required; and (3) real-world sequence
learning tasks for sentiment analysis, visual object recognition and text
categorisation. We show that state-regularization (a) simplifies the extraction
of finite state automata that display an RNN's state transition dynamics; (b)
forces RNNs to operate more like automata with external memory and less like
finite state machines, which potentially leads to a more structured memory; and
(c) leads to better interpretability and explainability of RNNs by leveraging
the probabilistic finite state transition mechanism over time steps.
Comment: To appear in IEEE Transactions on Pattern Analysis and Machine
Intelligence. The extended version of State-Regularized Recurrent Neural
Networks [arXiv:1901.08817].
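A minimal sketch of the state-regularization idea as described above, assuming a soft assignment of each candidate hidden state to k learnable centroid states; the class name, the GRU cell choice, and the mixture update are illustrative assumptions rather than the authors' exact implementation.

# Sketch of a state-regularized recurrent layer: after every recurrent step the
# hidden state is re-expressed as a probabilistic mixture of k learnable states.
import torch
import torch.nn as nn
import torch.nn.functional as F

class StateRegularizedRNN(nn.Module):
    def __init__(self, input_size, hidden_size, n_states=10, temperature=1.0):
        super().__init__()
        self.cell = nn.GRUCell(input_size, hidden_size)
        # k learnable states the hidden vector is pulled towards.
        self.centroids = nn.Parameter(torch.randn(n_states, hidden_size))
        self.temperature = temperature

    def forward(self, x):                       # x: (batch, time, input_size)
        h = x.new_zeros(x.size(0), self.centroids.size(1))
        transitions = []
        for t in range(x.size(1)):
            u = self.cell(x[:, t], h)           # unconstrained candidate state
            scores = u @ self.centroids.t() / self.temperature
            alpha = F.softmax(scores, dim=1)    # soft assignment over k states
            h = alpha @ self.centroids          # mixture of learnable states
            transitions.append(alpha)
        return h, torch.stack(transitions, dim=1)

rnn = StateRegularizedRNN(input_size=8, hidden_size=16, n_states=5)
h, alphas = rnn(torch.randn(4, 20, 8))
print(h.shape, alphas.shape)                    # (4, 16) and (4, 20, 5)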