4,988 research outputs found
Towards Ultra-High Performance and Energy Efficiency of Deep Learning Systems: An Algorithm-Hardware Co-Optimization Framework
Hardware accelerations of deep learning systems have been extensively
investigated in industry and academia. The aim of this paper is to achieve
ultra-high energy efficiency and performance for hardware implementations of
deep neural networks (DNNs). An algorithm-hardware co-optimization framework is
developed, which is applicable to different DNN types, sizes, and application
scenarios. The algorithm part adopts the general block-circulant matrices to
achieve a fine-grained tradeoff between accuracy and compression ratio. It
applies to both fully-connected and convolutional layers and contains a
mathematically rigorous proof of the effectiveness of the method. The proposed
algorithm reduces computational complexity per layer from O() to O() and storage complexity from O() to O(), both for training and
inference. The hardware part consists of highly efficient Field Programmable
Gate Array (FPGA)-based implementations using effective reconfiguration, batch
processing, deep pipelining, resource re-using, and hierarchical control.
Experimental results demonstrate that the proposed framework achieves at least
152X speedup and 71X energy efficiency gain compared with IBM TrueNorth
processor under the same test accuracy. It achieves at least 31X energy
efficiency gain compared with the reference FPGA-based work.Comment: 6 figures, AAAI Conference on Artificial Intelligence, 201
FMCW rail-mounted SAR: Porting spotlight SAR imaging from MATLAB to FPGA
In this work, a low-cost laptop-based radar platform derived from the MIT open courseware has been implemented. It can perform ranging, Doppler measurement and SAR imaging using MATLAB as the processor. In this work, porting the signal processing algorithms onto a FPGA platform will be addressed as well as differences between results obtained using MATLAB and those obtained using the FPGA platform. The target FPGA platforms were a Virtex6 DSP kit and Spartan3A starter kit, the latter was also low-cost to further reduce the cost for students to access radar technology
Porting Spotlight Range Migration Algorithm Processor from Matlab to Virtex 6
This paper describes the implementation and optimization of a Synthetic Aperture Radar process Spotlight Range Migration Algorithm processor on FPGA Virtex 6 DSP kit that fits on the chip. The mean/max error compared to a software implementation is -54/-28.74dB for 55 elements and 882 samples
- …