Accelerating Deterministic and Stochastic Binarized Neural Networks on FPGAs Using OpenCL
Recent technological advances have proliferated the available computing
power, memory, and speed of modern Central Processing Units (CPUs), Graphics
Processing Units (GPUs), and Field Programmable Gate Arrays (FPGAs).
Consequently, the performance and complexity of Artificial Neural Networks
(ANNs) are burgeoning. While GPU accelerated Deep Neural Networks (DNNs)
currently offer state-of-the-art performance, they consume large amounts of
power. Training such networks on CPUs is inefficient, as data throughput and
parallel computation are limited. FPGAs are considered a suitable candidate for
performance-critical, low-power systems, e.g. Internet of Things (IoT) edge
devices. Using the Xilinx SDAccel or Intel FPGA SDK for OpenCL development
environment, networks described using the high-level OpenCL framework can be
accelerated on heterogeneous platforms. Moreover, the resource utilization and
power consumption of DNNs can be further reduced by utilizing regularization
techniques that binarize network weights. In this paper, we introduce, to the
best of our knowledge, the first FPGA-accelerated stochastically binarized DNN
implementations, and compare them to implementations accelerated using both
GPUs and FPGAs. Our developed networks are trained and benchmarked using the
popular MNIST and CIFAR-10 datasets, and achieve near state-of-the-art
performance, while offering a >16-fold improvement in power consumption,
compared to conventional GPU-accelerated networks. Both our FPGA-accelerated
deterministic and stochastic BNNs reduce inference times on MNIST and CIFAR-10
by >9.89x and >9.91x, respectively. Comment: 4 pages, 3 figures, 1 table
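The abstract above distinguishes deterministic from stochastic weight binarization without giving formulas. The following is a minimal sketch of one common formulation from the binarized-network literature (sign function for the deterministic case, hard-sigmoid sampling for the stochastic case); it is an assumption for illustration and not necessarily the scheme the authors implemented. The function names are hypothetical.

```python
import random


def hard_sigmoid(x):
    """Clip (x + 1) / 2 to [0, 1]; used here as the probability of +1 (assumed form)."""
    return min(1.0, max(0.0, (x + 1.0) / 2.0))


def binarize_deterministic(weights):
    """Map each real-valued weight to +1 if it is non-negative, otherwise -1."""
    return [1 if w >= 0 else -1 for w in weights]


def binarize_stochastic(weights, rng):
    """Sample +1 with probability hard_sigmoid(w), otherwise -1."""
    return [1 if rng.random() < hard_sigmoid(w) else -1 for w in weights]


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed only to make the illustration repeatable
    weights = [-1.5, -0.2, 0.0, 0.4, 2.0]
    print(binarize_deterministic(weights))
    print(binarize_stochastic(weights, rng))
```

In this sketch, weights at or beyond the range [-1, 1] binarize the same way under both schemes, while weights near zero are the ones whose sign the stochastic variant randomizes.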
A High Performance Fuzzy Logic Architecture for UAV Decision Making
The majority of Unmanned Aerial Vehicles (UAVs) in operation today are not truly autonomous, but are instead reliant on a remote human pilot. A high degree of autonomy can provide many advantages in terms of cost, operational resources and safety. However, one of the challenges involved in achieving autonomy is that of replicating the reasoning and decision-making capabilities of a human pilot. One candidate method for providing this decision-making capability is fuzzy logic. In this role, the fuzzy system must satisfy real-time constraints, process large quantities of data and relate to large knowledge bases. Consequently, there is a need for a generic, high-performance fuzzy computation platform for UAV applications. Based on Lees’ [1] original work, a high-performance fuzzy processing architecture, implemented in Field Programmable Gate Arrays (FPGAs), has been developed and is shown to outclass the performance of existing fuzzy processors.
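To make the kind of computation a fuzzy processor performs more concrete, here is a minimal software sketch of fuzzy inference: fuzzification with triangular membership functions, three rules with singleton outputs, and weighted-average defuzzification. The variable names, membership ranges, and rule outputs are hypothetical and chosen only for illustration; none of this is taken from the architecture described in the abstract.

```python
def triangular(x, a, b, c):
    """Degree of membership of x in a triangle with feet a and c and peak b."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


def infer_climb_rate(altitude_error_m):
    """Map an altitude error (metres) to a climb-rate command (m/s).

    All ranges and output values below are illustrative assumptions.
    """
    # Clamp the input so that at least one membership function always fires.
    e = max(-50.0, min(50.0, altitude_error_m))

    # Fuzzification: degree to which the UAV is below, on, or above target.
    below = triangular(e, -100.0, -50.0, 0.0)
    on_target = triangular(e, -50.0, 0.0, 50.0)
    above = triangular(e, 0.0, 50.0, 100.0)

    # Rule base: (firing strength, singleton output in m/s).
    rules = [(below, 5.0), (on_target, 0.0), (above, -5.0)]

    # Defuzzification: weighted average of the rule outputs.
    total = sum(strength for strength, _ in rules)
    return sum(strength * out for strength, out in rules) / total


if __name__ == "__main__":
    for error in (-200.0, -25.0, 0.0, 25.0):
        print(error, infer_climb_rate(error))
```

A hardware fuzzy processor would evaluate many such rules in parallel; the sequential loop here only shows the arithmetic involved.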
FPGA-based conformance testing and system prototyping of an MPEG-4 SA-DCT hardware accelerator
Two FPGA implementations of a shape adaptive discrete cosine transform (SA-DCT) accelerator are presented in this paper: one PCI-based and the other AMBA-based. The former is used for conformance testing with the MPEG-4 standard requirements. The latter is an alternative platform for system prototyping and has an architecture more representative of a mobile device. The proposed accelerator meets real time constraints on both platforms with a gate count of approximately 40k, and outperforms the optimised reference software implementation by 20×. It is estimated that the accelerator consumes 250 mW on a Virtex-E FPGA and 79 mW on a Virtex-II FPGA in the worst case scenario.
Memory and information processing in neuromorphic systems
A striking difference between brain-inspired neuromorphic processors and
current von Neumann processor architectures is the way in which memory and
processing are organized. As Information and Communication Technologies continue
to address the need for increased computational power through the increase of
cores within a digital processor, neuromorphic engineers and scientists can
complement this need by building processor architectures where memory is
distributed with the processing. In this paper we present a survey of
brain-inspired processor architectures that support models of cortical networks
and deep neural networks. These architectures range from serial clocked
implementations of multi-neuron systems to massively parallel asynchronous ones
and from purely digital systems to mixed analog/digital systems which implement
more biological-like models of neurons and synapses together with a suite of
adaptation and learning mechanisms analogous to the ones found in biological
nervous systems. We describe the advantages of the different approaches being
pursued and present the challenges that need to be addressed for building
artificial neural processing systems that can display the richness of behaviors
seen in biological systems. Comment: Submitted to Proceedings of IEEE, review of recently proposed
neuromorphic computing platforms and systems