Layer-compensated Pruning for Resource-constrained Convolutional Neural Networks
Resource-efficient convolutional neural networks enable not only
intelligence on edge devices but also opportunities in system-level
optimization such as scheduling. In this work, we aim to improve the
performance of resource-constrained filter pruning by merging two sub-problems
commonly considered, i.e., (i) how many filters to prune for each layer and
(ii) which filters to prune given a per-layer pruning budget, into a global
filter ranking problem. Our framework entails a novel algorithm, dubbed
layer-compensated pruning, where meta-learning is involved to determine better
solutions. We show empirically that the proposed algorithm is superior to prior
art in both effectiveness and efficiency. Specifically, we reduce the accuracy
gap between the pruned and original networks from 0.9% to 0.7% with an 8x
reduction in the time needed for meta-learning, i.e., from 1 hour down to 7
minutes. We demonstrate the effectiveness of our algorithm using the
ResNet and MobileNetV2 networks on the CIFAR-10, ImageNet, and Bird-200
datasets. Comment: 11 pages, 8 figures, work in progress
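The core idea of folding the two sub-problems into a single global ranking can be sketched as follows. This is a minimal illustration, not the paper's implementation: the per-layer compensation offsets (which the paper obtains via meta-learning) and the L1-norm importance measure are assumptions here, and the function name is hypothetical.

```python
def globally_ranked_pruning(filter_norms, layer_offsets, num_to_prune):
    """Rank all filters across all layers jointly and prune the lowest-scored.

    filter_norms: filter_norms[l][i] is the importance of filter i in
        layer l (e.g., its L1 weight norm) -- illustrative choice.
    layer_offsets: per-layer compensation added to each filter's score
        before global ranking (stand-in for meta-learned offsets).
    Returns a set of (layer, filter) index pairs to prune.
    """
    scored = []
    for l, norms in enumerate(filter_norms):
        for i, norm in enumerate(norms):
            # Compensated score: raw importance plus the layer's offset.
            scored.append((norm + layer_offsets[l], l, i))
    scored.sort()  # ascending: least important filters first
    return {(l, i) for _, l, i in scored[:num_to_prune]}
```

For example, with two layers and an offset that boosts the second layer's filters, `globally_ranked_pruning([[0.1, 0.9, 0.5], [0.2, 0.8]], [0.0, 0.3], 2)` prunes two filters from the first layer, since the compensated scores of the second layer's filters rank higher globally.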
TIMELY: Pushing Data Movements and Interfaces in PIM Accelerators Towards Local and in Time Domain
Resistive-random-access-memory (ReRAM) based processing-in-memory (RPIM)
accelerators show promise in bridging the gap between Internet of Things
devices' constrained resources and Convolutional/Deep Neural Networks'
(CNNs/DNNs') prohibitive energy cost. Specifically, RPIM accelerators
enhance energy efficiency by eliminating the cost of weight movements and
improving the computational density through ReRAM's high density. However, the
energy efficiency is still limited by the dominant energy cost of input and
partial sum (Psum) movements and the cost of digital-to-analog (D/A) and
analog-to-digital (A/D) interfaces. In this work, we identify three
energy-saving opportunities in RPIM accelerators: analog data locality,
time-domain interfacing, and input access reduction, and propose an innovative
RPIM accelerator called TIMELY, with three key contributions: (1) TIMELY
adopts analog local buffers (ALBs) within ReRAM crossbars to greatly enhance
the data locality, minimizing the energy overheads of both input and Psum
movements; (2) TIMELY largely reduces the energy of each single D/A (and A/D)
conversion and the total number of conversions by using time-domain interfaces
(TDIs) and the employed ALBs, respectively; (3) we develop an only-once input
read (OIR) mapping method to further decrease the energy of input accesses
and the number of D/A conversions. The evaluation with more than 10 CNN/DNN
models and various chip configurations shows that, TIMELY outperforms the
baseline RPIM accelerator, PRIME, by one order of magnitude in energy
efficiency while maintaining better computational density (up to 31.2x)
and throughput (up to 736.6x). Furthermore, comprehensive studies are
performed to evaluate the effectiveness of the proposed ALB, TDI, and OIR
innovations in terms of energy savings and area reduction. Comment: Accepted by the 47th International Symposium on Computer Architecture
(ISCA'2020).
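The benefit of keeping partial sums in the analog domain can be illustrated with a toy conversion-count model. This is a simplified sketch, not TIMELY's actual energy model: the function name, parameters, and the assumption that analog accumulation collapses per-crossbar conversions into one conversion per output are all illustrative.

```python
def adc_conversions(num_crossbars, outputs_per_crossbar, analog_accumulate):
    """Toy count of A/D conversions for one output tile.

    Without analog accumulation, each crossbar's partial sum (Psum) is
    digitized before digital summation. With analog accumulation
    (stand-in for TIMELY's analog local buffers), Psums are summed in
    the analog domain and only the final result is converted.
    """
    if analog_accumulate:
        # One conversion per final output, regardless of crossbar count.
        return outputs_per_crossbar
    # One conversion per crossbar per output.
    return num_crossbars * outputs_per_crossbar
```

Under this toy model, accumulating across 8 crossbars in the analog domain cuts the conversion count by 8x (e.g., 128 conversions instead of 1024 for 128 outputs), which is the kind of interface-cost reduction the ALBs target.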