
17 research outputs found

Sparse Training Theory for Scalable and Efficient Agents

Authors: Curci Selima; Ernst Damien; Gibescu Madeleine; Mocanu Decebal Constantin; Mocanu Elena; Nguyen Phuong H.; Pinto Tiago; Vale Zita A.
Publication date: 01/01/2021

A fundamental task for artificial intelligence is learning. Deep neural networks have proven to cope perfectly with all learning paradigms, i.e., supervised, unsupervised, and reinforcement learning. Nevertheless, traditional deep learning approaches rely on cloud computing facilities and do not scale well to autonomous agents with low computational resources. Even in the cloud, they suffer from computational and memory limitations, and they cannot adequately model large physical worlds for agents, a task that assumes networks with billions of neurons. In the last few years, these issues have been addressed by the emerging topic of sparse training, which trains sparse networks from scratch. This paper discusses the state of the art in sparse training, its challenges, and its limitations, while introducing a couple of new theoretical research directions that have the potential to alleviate these limitations and push deep learning scalability well beyond its current boundaries. The impact of these theoretical advancements in complex multi-agent settings is then discussed from a real-world perspective, using the smart grid as a case study.

arXiv.org e-Print Archive

Pure OAI Repository

Open Repository and Bibliography - Liège

University of Twente Research Information

TensorDash: Exploiting Sparsity to Accelerate Deep Neural Network Training and Inference

Authors: Albericio Jorge; Awad Omar Mohamed; Edo Isak; Mahmoud Mostafa; Moshovos Andreas; Pekhimenko Gennady; Zadeh Ali Hadi
Publication date: 01/09/2020

TensorDash is a hardware-level technique for enabling data-parallel MAC units to take advantage of sparsity in their input operand streams. When used to compose a hardware accelerator for deep learning, TensorDash can speed up the training process while also increasing energy efficiency. TensorDash combines a low-cost, sparse input operand interconnect, comprising an 8-input multiplexer per multiplier input, with an area-efficient hardware scheduler. While the interconnect allows a very limited set of movements per operand, the scheduler can effectively extract sparsity when it is present in the activations, weights, or gradients of neural networks. Over a wide set of models covering various applications, TensorDash accelerates the training process by 1.95× while being 1.89× more energy-efficient, or 1.6× more energy-efficient when on-chip and off-chip memory accesses are taken into account. While TensorDash works with any datatype, we demonstrate it with both single-precision floating-point units and bfloat16.
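The idea of exploiting sparsity in MAC operand streams can be illustrated with a minimal software sketch. This is not TensorDash's scheduler or interconnect: it assumes an idealized unit that can skip every multiply-accumulate in which either operand is zero, so the ratio it reports is only an upper bound on what a limited-movement hardware scheduler could reach. The function name `sparse_mac` and the sample operand values are hypothetical.

```python
def sparse_mac(a, b):
    """Dot product that skips ineffectual pairs (either operand is zero).

    Returns the accumulated sum and the number of multiplications
    that were actually performed.
    """
    acc = 0.0
    effectual = 0
    for x, y in zip(a, b):
        if x == 0 or y == 0:
            continue  # a zero operand contributes nothing to the sum
        acc += x * y
        effectual += 1
    return acc, effectual


# Hypothetical operand streams with zeros in both activations and weights.
activations = [0.0, 1.5, 0.0, 2.0, 0.0, 0.0, 3.0, 0.0]
weights = [0.5, 2.0, 1.0, 0.0, 4.0, 0.0, 1.0, 2.0]

result, effectual = sparse_mac(activations, weights)

# Idealized bound: dense work divided by effectual work.
ideal_speedup = len(activations) / effectual
```

In this toy stream only two of the eight operand pairs are effectual, which is why the idealized bound is far above the average training speedup quoted in the abstract; real operand streams are less sparse and the hardware can only move operands within a restricted window.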

arXiv.org e-Print Archive

Crossref