Search CORE

33,581 research outputs found

EmBench: Quantifying Performance Variations of Deep Neural Networks across Modern Commodity Devices

Author: Almeida Mario
Lane Nicholas D.
Laskaridis Stefanos
Leontiadis Ilias
Venieris Stylianos I.
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2019
Field of study

In recent years, advances in deep learning have resulted in unprecedented leaps in diverse tasks spanning from speech and object recognition to context awareness and health monitoring. As a result, an increasing number of AI-enabled applications are being developed targeting ubiquitous and mobile devices. While deep neural networks (DNNs) are getting bigger and more complex, they also impose a heavy computational and energy burden on the host devices, which has led to the integration of various specialized processors in commodity devices. Given the broad range of competing DNN architectures and the heterogeneity of the target hardware, there is an emerging need to understand the compatibility between DNN-platform pairs and the expected performance benefits on each platform. This work attempts to demystify this landscape by systematically evaluating a collection of state-of-the-art DNNs on a wide variety of commodity devices. In this respect, we identify potential bottlenecks in each architecture and provide important guidelines that can assist the community in the co-design of more efficient DNNs and accelerators.Comment: Accepted at MobiSys 2019: 3rd International Workshop on Embedded and Mobile Deep Learning (EMDL), 201

arXiv.org e-Print Archive

Crossref

Resource Efficient Authentication and Session Key Establishment Procedure for Low-Resource IoT Devices

Author: Al-Bayatti Ali Hilal
Alalwan Nasser
Alfarrj Osama
Alzahrani Ahmed
Khan Sarmadullah
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 14/11/2019
Field of study

open access journalThe Internet of Things (IoT) can includes many resource-constrained devices, with most usually needing to securely communicate with their network managers, which are more resource-rich devices in the IoT network. We propose a resource-efficient security scheme that includes authentication of devices with their network managers, authentication between devices on different networks, and an attack-resilient key establishment procedure. Using automated validation with internet security protocols and applications tool-set, we analyse several attack scenarios to determine the security soundness of the proposed solution, and then we evaluate its performance analytically and experimentally. The performance analysis shows that the proposed solution occupies little memory and consumes low energy during the authentication and key generation processes respectively. Moreover, it protects the network from well-known attacks (man-in-the-middle attacks, replay attacks, impersonation attacks, key compromission attacks and denial of service attacks)

De Montfort University Open Research Archive

Toolflows for Mapping Convolutional Neural Networks on FPGAs: A Survey and Future Directions

Author: Bouganis Christos-Savvas
Kouris Alexandros
Venieris Stylianos I.
Publication venue
Publication date: 19/02/2018
Field of study

In the past decade, Convolutional Neural Networks (CNNs) have demonstrated state-of-the-art performance in various Artificial Intelligence tasks. To accelerate the experimentation and development of CNNs, several software frameworks have been released, primarily targeting power-hungry CPUs and GPUs. In this context, reconfigurable hardware in the form of FPGAs constitutes a potential alternative platform that can be integrated in the existing deep learning ecosystem to provide a tunable balance between performance, power consumption and programmability. In this paper, a survey of the existing CNN-to-FPGA toolflows is presented, comprising a comparative study of their key characteristics which include the supported applications, architectural choices, design space exploration methods and achieved performance. Moreover, major challenges and objectives introduced by the latest trends in CNN algorithmic research are identified and presented. Finally, a uniform evaluation methodology is proposed, aiming at the comprehensive, complete and in-depth evaluation of CNN-to-FPGA toolflows.Comment: Accepted for publication at the ACM Computing Surveys (CSUR) journal, 201

arXiv.org e-Print Archive

Spiral - Imperial College Digital Repository