2 research outputs found
Low Latency CMOS Hardware Acceleration for Fully Connected Layers in Deep Neural Networks
We present a novel low latency CMOS hardware accelerator for fully connected
(FC) layers in deep neural networks (DNNs). The FC accelerator, FC-ACCL, is
based on 128 8x8 or 16x16 processing elements (PEs) for matrix-vector
multiplication, and 128 multiply-accumulate (MAC) units integrated with 128
High Bandwidth Memory (HBM) units for storing the pretrained weights.
Micro-architectural details for CMOS ASIC implementations are presented and
simulated performance is compared to recent hardware accelerators for DNNs for
AlexNet and VGG16. When comparing simulated processing latency for a 4096-1000
FC8 layer, our FC-ACCL is able to achieve 48.4 GOPS (with a 100 MHz clock)
which improves on a recent FC8 layer accelerator quoted at 28.8 GOPS with a 150
MHz clock. We have achieved this considerable improvement by fully utilizing
the HBM units for storing and reading out column-specific FC layer weights in 1
cycle with a novel column-row-column schedule, and implementing a maximally
parallel datapath for processing these weights with the corresponding MAC and
PE units. When up-scaled to 128 16x16 PEs, for 16x16 tiles of weights, the
design can reduce latency for the large FC6 layer by 60 % in AlexNet and by 3 %
in VGG16 when compared to an alternative EIE solution which uses compression.
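
The abstract above describes FC-layer inference as a matrix-vector multiplication carried out on small tiles of weights (8x8 or 16x16) with multiply-accumulate units collecting the partial sums. A minimal software sketch of that general idea follows. It is not the paper's column-row-column schedule or its hardware datapath: the function names, the row-major weight layout, and the loop order are all assumptions made for illustration.

```python
from typing import List

Matrix = List[List[float]]
Vector = List[float]


def tiled_matvec(weights: Matrix, x: Vector, tile: int = 8) -> Vector:
    """Compute y = W x by walking W in tile x tile blocks.

    Hypothetical helper: each inner block stands in for one pass of a
    processing element, and the update of y[r] stands in for a MAC unit
    accumulating partial sums. The loop order is an assumption.
    """
    rows, cols = len(weights), len(x)
    y = [0.0] * rows
    for r0 in range(0, rows, tile):        # origin row of the current tile
        for c0 in range(0, cols, tile):    # origin column of the current tile
            for r in range(r0, min(r0 + tile, rows)):
                partial = 0.0
                for c in range(c0, min(c0 + tile, cols)):
                    partial += weights[r][c] * x[c]
                y[r] += partial            # accumulate this tile's contribution
    return y


def direct_matvec(weights: Matrix, x: Vector) -> Vector:
    """Untiled reference used only to check the tiled version."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]
```

Under this sketch, a 4096-1000 layer such as FC8 would correspond to a weight matrix with 1000 rows and 4096 columns, and the tile size would be 8 or 16. Edge tiles are clipped with `min`, so the dimensions need not be multiples of the tile size.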
A Survey of Neuromorphic Computing and Neural Networks in Hardware
Neuromorphic computing has come to refer to a variety of brain-inspired
computers, devices, and models that contrast with the pervasive von Neumann computer
architecture. This biologically inspired approach has created highly connected
synthetic neurons and synapses that can be used to model neuroscience theories
as well as solve challenging machine learning problems. The promise of the
technology is to create a brain-like ability to learn and adapt, but the
technical challenges are significant, starting with an accurate neuroscience
model of how the brain works, to finding materials and engineering
breakthroughs to build devices to support these models, to creating a
programming framework so the systems can learn, to creating applications with
brain-like capabilities. In this work, we provide a comprehensive survey of the
research and motivations for neuromorphic computing over its history. We begin
with a 35-year review of the motivations and drivers of neuromorphic computing,
then look at the major research areas of the field, which we define as
neuro-inspired models, algorithms and learning approaches, hardware and
devices, supporting systems, and finally applications. We conclude with a broad
discussion on the major research topics that need to be addressed in the coming
years to see the promise of neuromorphic computing fulfilled. The goals of this
work are to provide an exhaustive review of the research conducted in
neuromorphic computing since the inception of the term, and to motivate further
work by illuminating gaps in the field where new research is needed.