Searching Toward Pareto-Optimal Device-Aware Neural Architectures
Recent breakthroughs in Neural Architecture Search (NAS) have achieved
state-of-the-art performance on many tasks such as image classification and
language understanding. However, most existing works optimize only for model
accuracy and largely ignore other important factors imposed by the underlying
hardware and devices, such as latency and energy consumption during inference. In
this paper, we first introduce the problem of NAS and survey
recent works. Then we examine in depth two recent advances that extend NAS
into multiple-objective frameworks: MONAS and DPP-Net. Both MONAS and DPP-Net
are capable of optimizing accuracy and other objectives imposed by devices,
searching for neural architectures that can be best deployed on a wide spectrum
of devices: from embedded systems and mobile devices to workstations.
Experimental results show that the architectures found by MONAS and
DPP-Net achieve Pareto optimality w.r.t. the given objectives for various
devices.

Comment: ICCAD'18 Invited Paper
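The Pareto optimality claimed above can be made concrete with a small sketch. The objectives (accuracy, latency, energy) and all numbers below are illustrative assumptions, not values from the paper:

```python
# Pareto dominance over (accuracy %, latency ms, energy mJ):
# maximize accuracy, minimize latency and energy.

def dominates(a, b):
    """True if architecture a is no worse than b on every objective
    and strictly better on at least one."""
    acc_a, lat_a, en_a = a
    acc_b, lat_b, en_b = b
    no_worse = acc_a >= acc_b and lat_a <= lat_b and en_a <= en_b
    strictly_better = acc_a > acc_b or lat_a < lat_b or en_a < en_b
    return no_worse and strictly_better

def pareto_front(candidates):
    """Keep only candidates not dominated by any other candidate."""
    return [c for c in candidates
            if not any(dominates(o, c) for o in candidates if o is not c)]

# Made-up trade-off points: a large accurate net, a small fast net,
# and a point that is strictly worse than the small net.
archs = [(92.0, 30.0, 15.0), (90.0, 10.0, 8.0), (89.0, 12.0, 9.0)]
front = pareto_front(archs)
```

A multi-objective NAS such as MONAS or DPP-Net aims to return only architectures on this front, so that no returned model can be improved on one device-imposed objective without sacrificing another.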
Optimality Assessment of Memory-Bounded ConvNets Deployed on Resource-Constrained RISC Cores
A cost-effective implementation of Convolutional Neural Nets on the mobile edge of the Internet of Things (IoT) requires smart optimizations to fit large models into memory-constrained cores. Reduction methods that jointly combine filter pruning and weight quantization have proven efficient at finding the compression that ensures minimum model size without accuracy loss. However, other optimal configurations exist that stem from the memory constraint. The objective of this work is to assess such memory-bounded implementations and to show that most of them are centred on specific parameter settings that are difficult to implement on a low-power RISC core. Hence, the focus is on quantifying the distance to optimality of the closest implementations that can actually be deployed on hardware. The analysis is powered by a two-stage framework that efficiently explores the memory-accuracy space using a lightweight, hardware-conscious heuristic optimization. Results are collected from three realistic IoT tasks (Image Classification on CIFAR-10, Keyword Spotting on the Speech Commands Dataset, Facial Expression Recognition on Fer2013) run on RISC cores (Cortex-M by ARM) with a few hundred KB of on-chip RAM.
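The memory bound at the core of this abstract can be sketched as a simple feasibility check: after pruning filters and quantizing weights, does the model still fit the on-chip RAM budget? The layer sizes, keep ratio, and budget below are illustrative assumptions, not the paper's actual configurations or framework:

```python
# Estimate weight storage of a pruned + quantized ConvNet and check it
# against an on-chip RAM budget (hypothetical numbers throughout).

def model_bytes(layer_params, keep_ratio, bits):
    """Total weight storage in bytes after keeping a fraction of the
    parameters per layer (filter pruning) and quantizing to `bits` bits."""
    total = 0
    for n_params in layer_params:
        kept = int(n_params * keep_ratio)  # parameters surviving pruning
        total += kept * bits // 8          # storage at the chosen bit width
    return total

# Made-up per-layer parameter counts for a small ConvNet.
layers = [4_608, 73_728, 294_912, 65_536]
budget = 128 * 1024  # e.g. 128 KB of on-chip RAM on a Cortex-M class core

# Sweep a few (bit width, keep ratio) settings and flag feasible ones.
for bits in (8, 4):
    for keep in (1.0, 0.75, 0.5):
        size = model_bytes(layers, keep, bits)
        print(f"bits={bits} keep={keep}: {size} B, fits={size <= budget}")
```

In this toy sweep the full-precision 8-bit model exceeds the budget, while aggressive joint pruning and 4-bit quantization bring it under; the paper's framework explores this memory-accuracy space with a hardware-conscious heuristic rather than the exhaustive sweep shown here.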