3,041 research outputs found

    Accelerating Deterministic and Stochastic Binarized Neural Networks on FPGAs Using OpenCL

    Full text link
    Recent technological advances have proliferated the available computing power, memory, and speed of modern Central Processing Units (CPUs), Graphics Processing Units (GPUs), and Field Programmable Gate Arrays (FPGAs). Consequently, the performance and complexity of Artificial Neural Networks (ANNs) is burgeoning. While GPU accelerated Deep Neural Networks (DNNs) currently offer state-of-the-art performance, they consume large amounts of power. Training such networks on CPUs is inefficient, as data throughput and parallel computation is limited. FPGAs are considered a suitable candidate for performance critical, low power systems, e.g. the Internet of Things (IOT) edge devices. Using the Xilinx SDAccel or Intel FPGA SDK for OpenCL development environment, networks described using the high-level OpenCL framework can be accelerated on heterogeneous platforms. Moreover, the resource utilization and power consumption of DNNs can be further enhanced by utilizing regularization techniques that binarize network weights. In this paper, we introduce, to the best of our knowledge, the first FPGA-accelerated stochastically binarized DNN implementations, and compare them to implementations accelerated using both GPUs and FPGAs. Our developed networks are trained and benchmarked using the popular MNIST and CIFAR-10 datasets, and achieve near state-of-the-art performance, while offering a >16-fold improvement in power consumption, compared to conventional GPU-accelerated networks. Both our FPGA-accelerated determinsitic and stochastic BNNs reduce inference times on MNIST and CIFAR-10 by >9.89x and >9.91x, respectively.Comment: 4 pages, 3 figures, 1 tabl

    A High Performance Fuzzy Logic Architecture for UAV Decision Making

    Get PDF
    The majority of Unmanned Aerial Vehicles (UAVs) in operation today are not truly autonomous, but are instead reliant on a remote human pilot. A high degree of autonomy can provide many advantages in terms of cost, operational resources and safety. However, one of the challenges involved in achieving autonomy is that of replicating the reasoning and decision making capabilities of a human pilot. One candidate method for providing this decision making capability is fuzzy logic. In this role, the fuzzy system must satisfy real-time constraints, process large quantities of data and relate to large knowledge bases. Consequently, there is a need for a generic, high performance fuzzy computation platform for UAV applications. Based on Lees’ [1] original work, a high performance fuzzy processing architecture, implemented in Field Programmable Gate Arrays (FPGAs), has been developed and is shown to outclass the performance of existing fuzzy processors

    FPGA-based conformance testing and system prototyping of an MPEG-4 SA-DCT hardware accelerator

    Get PDF
    Two FPGA implementations of a shape adaptive discrete cosine transform (SA-DCT) accelerator are presented in this paper: one PCI-based and the other AMBA-based. The former is used for conformance testing with the MPEG-4 standard requirements. The latter is an alternative platform for system prototyping and has an architecture more representative of a mobile device. The proposed accelerator meets real time constraints on both platforms with a gate count of approximately 40k, and outperforms the optimised reference software implementation by 20/spl times/. It is estimated that the accelerator consumes 250mW on a Virtex-E FPGA and 79mW on a Virtex-II FPGA in the worst case scenario

    Memory and information processing in neuromorphic systems

    Full text link
    A striking difference between brain-inspired neuromorphic processors and current von Neumann processors architectures is the way in which memory and processing is organized. As Information and Communication Technologies continue to address the need for increased computational power through the increase of cores within a digital processor, neuromorphic engineers and scientists can complement this need by building processor architectures where memory is distributed with the processing. In this paper we present a survey of brain-inspired processor architectures that support models of cortical networks and deep neural networks. These architectures range from serial clocked implementations of multi-neuron systems to massively parallel asynchronous ones and from purely digital systems to mixed analog/digital systems which implement more biological-like models of neurons and synapses together with a suite of adaptation and learning mechanisms analogous to the ones found in biological nervous systems. We describe the advantages of the different approaches being pursued and present the challenges that need to be addressed for building artificial neural processing systems that can display the richness of behaviors seen in biological systems.Comment: Submitted to Proceedings of IEEE, review of recently proposed neuromorphic computing platforms and system
    • …
    corecore