Accelerating Deep Learning Inference on Mobile Systems

Author: Abhishek Sehgal
D Frajberg
HD Cho
HT Dinh
R Fedorov
S AlEbrahim
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2019

Artificial Intelligence on the edge is a matter of great importance towards the enhancement of smart devices that rely on operations with real-time constraints. We present PolimiDL, a framework for the acceleration of Deep Learning on mobile and embedded systems with limited resources and heterogeneous architectures. Experimental results are competitive with TensorFlow Lite for the execution of small models.

Archivio istituzionale della ricerca - Politecnico di Milano

Crossref

Modeling the Resource Requirements of Convolutional Neural Networks on Mobile Devices

Author: He Kaiming
Ioffe Sergey
Kim Yong-Deok
Krizhevsky Alex
Lane Nicholas D
Lin Min
Szegedy Christian
Wu Jiaxiang
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 27/09/2017

Convolutional Neural Networks (CNNs) have revolutionized the research in computer vision, due to their ability to capture complex patterns, resulting in high inference accuracies. However, the increasingly complex nature of these neural networks means that they are particularly suited for server computers with powerful GPUs. We envision that deep learning applications will be eventually and widely deployed on mobile devices, e.g., smartphones, self-driving cars, and drones. Therefore, in this paper, we aim to understand the resource requirements (time, memory) of CNNs on mobile devices. First, by deploying several popular CNNs on mobile CPUs and GPUs, we measure and analyze the performance and resource usage for every layer of the CNNs. Our findings point out the potential ways of optimizing the performance on mobile devices. Second, we model the resource requirements of the different CNN computations. Finally, based on the measurement, profiling, and modeling, we build and evaluate our modeling tool, Augur, which takes a CNN configuration (descriptor) as the input and estimates the compute time and resource usage of the CNN, to give insights about whether and how efficiently a CNN can be run on a given mobile platform. In doing so Augur tackles several challenges: (i) how to overcome profiling and measurement overhead; (ii) how to capture the variance in different mobile platforms with different processors, memory, and cache sizes; and (iii) how to account for the variance in the number, type and size of layers of the different CNN configurations.
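As a rough illustration of what estimating resource usage from a CNN descriptor can look like, the sketch below computes multiply-accumulate counts and memory footprints for a stack of convolutional layers. It is not Augur's model, which the abstract does not detail: the `ConvLayer` descriptor, the `estimate` function, and the 4-byte value size are assumptions made for this example, and it covers only analytic counts, not measured time on a real device.

```python
from dataclasses import dataclass


@dataclass
class ConvLayer:
    """Hypothetical descriptor for one square-kernel convolutional layer."""
    in_channels: int
    out_channels: int
    kernel: int
    stride: int = 1
    padding: int = 0


def estimate(layers, height, width, bytes_per_value=4):
    """Return analytic totals for a sequence of conv layers.

    Counts multiply-accumulates (MACs), weight memory, and the largest
    single-layer output activation, assuming dense float storage.
    """
    total_macs = 0
    weight_bytes = 0
    peak_activation_bytes = 0
    for layer in layers:
        # Standard output-size formula for a strided, padded convolution.
        height = (height + 2 * layer.padding - layer.kernel) // layer.stride + 1
        width = (width + 2 * layer.padding - layer.kernel) // layer.stride + 1
        per_output = layer.in_channels * layer.kernel ** 2
        total_macs += height * width * layer.out_channels * per_output
        # One bias term per output channel.
        weight_bytes += layer.out_channels * (per_output + 1) * bytes_per_value
        activation = height * width * layer.out_channels * bytes_per_value
        peak_activation_bytes = max(peak_activation_bytes, activation)
    return {
        "macs": total_macs,
        "weight_bytes": weight_bytes,
        "peak_activation_bytes": peak_activation_bytes,
    }


if __name__ == "__main__":
    # An 11x11, stride-4 first layer on a 227x227 RGB input (AlexNet-like).
    first = ConvLayer(in_channels=3, out_channels=96, kernel=11, stride=4)
    print(estimate([first], 227, 227))
```

Counts like these only bound the work to be done; turning them into time estimates would require per-platform measurements of the kind the abstract describes.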

arXiv.org e-Print Archive

Crossref