CARLA: A Convolution Accelerator with a Reconfigurable and Low-Energy Architecture
Convolutional Neural Networks (CNNs) have proven to be extremely accurate for
image recognition, even outperforming human recognition capability. When
deployed on battery-powered mobile devices, efficient computer architectures
are required to enable fast and energy-efficient computation of costly
convolution operations. Despite recent advances in hardware accelerator design
for CNNs, two major problems have not yet been addressed effectively,
particularly when the convolution layers have highly diverse structures: (1)
minimizing energy-hungry off-chip DRAM data movements; (2) maximizing the
utilization factor of processing resources to perform convolutions. This work
thus proposes an energy-efficient architecture equipped with several optimized
dataflows to support the structural diversity of modern CNNs. The proposed
approach is evaluated by implementing convolutional layers of VGGNet-16 and
ResNet-50. Results show that the architecture achieves a Processing Element
(PE) utilization factor of 98% for the majority of 3x3 and 1x1 convolutional
layers, while limiting latency to 396.9 ms and 92.7 ms when performing
convolutional layers of VGGNet-16 and ResNet-50, respectively. In addition, the
proposed architecture benefits from the structured sparsity in ResNet-50 to
reduce the latency to 42.5 ms when half of the channels are pruned.
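The latency reduction reported for the pruned ResNet-50 follows directly from the standard convolution cost model: pruning half of a layer's channels removes roughly half of its multiply-accumulate (MAC) operations. A minimal sketch of this arithmetic, using hypothetical layer dimensions chosen for illustration (not taken from the paper):

```python
def conv_macs(h_out, w_out, c_in, c_out, k):
    """MAC count for one convolutional layer with a k x k kernel,
    c_in input channels, c_out output channels, and an
    h_out x w_out output feature map."""
    return h_out * w_out * c_in * c_out * k * k

# Hypothetical 3x3 layer: 56x56 output, 64 input and 64 output channels.
full = conv_macs(56, 56, 64, 64, 3)

# Structured pruning of half the output channels halves the MAC count.
pruned = conv_macs(56, 56, 64, 32, 3)

ratio = pruned / full  # 0.5
```

In practice the measured speedup (42.5 ms vs. 92.7 ms here) tracks this MAC ratio only to the extent that the accelerator keeps its processing elements utilized on the smaller workload, which is why structured (channel-level) rather than unstructured sparsity is exploited.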
Energy-Efficient, Flexible and Fast Architectures for Deep Convolutional Neural Network Acceleration
ABSTRACT: Deep learning-based methods, and specifically Convolutional Neural Networks (CNNs), have revolutionized the field of computer vision. Until 2012, the most accurate traditional image-processing methods reached a 26% error rate when recognizing images on the standardized and well-known ImageNet benchmark; a CNN-based method then dramatically reduced the error to 16%. By evolving CNN structures, current CNN-based methods now routinely achieve error rates below 3%, often outperforming human-level accuracy. CNNs consist of many convolutional layers, each performing high-dimensional, complex convolution operations. To achieve high image-recognition accuracy, modern CNNs stack many convolutional layers, which dramatically increases the diversity of computation patterns across layers. This high level of complexity in CNNs implies massive numbers of parameters and computations.
Since mobile processors are not designed to perform massive computations, deploying CNNs on portable and mobile devices is challenging.