Search CORE

4,341 research outputs found

YodaNN: An Architecture for Ultra-Low Power Binary-Weight CNN Acceleration

Author: Andri Renzo
Benini Luca
Cavigelli Lukas
Rossi Davide
Publication venue
Publication date: 24/02/2017
Field of study

Convolutional neural networks (CNNs) have revolutionized the world of computer vision over the last few years, pushing image classification beyond human accuracy. The computational effort of today's CNNs requires power-hungry parallel processors or GP-GPUs. Recent developments in CNN accelerators for system-on-chip integration have reduced energy consumption significantly. Unfortunately, even these highly optimized devices are above the power envelope imposed by mobile and deeply embedded applications and face hard limitations caused by CNN weight I/O and storage. This prevents the adoption of CNNs in future ultra-low power Internet of Things end-nodes for near-sensor analytics. Recent algorithmic and theoretical advancements enable competitive classification accuracy even when limiting CNNs to binary (+1/-1) weights during training. These new findings bring major optimization opportunities in the arithmetic core by removing the need for expensive multiplications, as well as reducing I/O bandwidth and storage. In this work, we present an accelerator optimized for binary-weight CNNs that achieves 1510 GOp/s at 1.2 V on a core area of only 1.33 MGE (Million Gate Equivalent) or 0.19 mm

^2

and with a power dissipation of 895 {\mu}W in UMC 65 nm technology at 0.6 V. Our accelerator significantly outperforms the state-of-the-art in terms of energy and area efficiency achieving 61.2 TOp/s/[email protected] V and 1135 GOp/s/[email protected] V, respectively

arXiv.org e-Print Archive

Crossref

Archivio istituzionale della ricerca - Alma Mater Studiorum Università di Bologna

On the Single Chip Implementation of a Hiperlan/2 and IEEE802.11a Capable Modem

Author: Dombrowski Kai
Fiebig Norbert
Grass Eckhard
Jagdhold Ulrich
Kraemer Rolf
Krueger Olaf
Lehman Jens
Lippert Gunther
Maehoenen Patrick
Maharatna Koushik
Tittelbach Klaus
Troya Alfonso
Publication venue
Publication date: 01/01/2001
Field of study

Southampton (e-Prints Soton)

Explore Bristol Research

Counting digital filters

Author: Zohar S.
Publication venue
Publication date: 08/05/1973
Field of study

Several embodiments of a counting digital filter of the non-recursive type are disclosed. In each embodiment two registers, at least one of which is a shift register, are included. The shift register received j sub x-bit data input words bit by bit. The kth data word is represented by the integer

NASA Technical Reports Server

Digital servo control of random sound test excitation

Author: Nakich R. B.
Publication venue
Publication date: 06/08/1974
Field of study

A digital servocontrol system for random noise excitation of a test object in a reverberant acoustic chamber employs a plurality of sensors spaced in the sound field to produce signals in separate channels which are decorrelated and averaged. The average signal is divided into a plurality of adjacent frequency bands cyclically sampled by a time division multiplex system, converted into digital form, and compared to a predetermined spectrum value stored in digital form. The results of the comparisons are used to control a time-shared up-down counter to develop gain control signals for the respective frequency bands in the spectrum of random sound energy picked up by the microphones

NASA Technical Reports Server

Spacecraft Microminiature PAM Decommutator System

Author
Publication venue
Publication date
Field of study

Operation and testing of spacecraft microminiature PAM decommutator syste

NASA Technical Reports Server

Implementation and Evaluation of Power Consumption of an Iris Pre-processing Algorithm on Modern FPGA

Author: Amiel F.
Blasinsky H.
Ea T.
Mikovicova B.
Rossant F.
Publication venue: Společnost pro radioelektronické inženýrství
Publication date: 01/12/2008
Field of study

In this article, the efficiency and applicability of several power reduction techniques applied on a modern 65nm FPGA is described. For image erosion and dilation algorithms, two major solutions were tested and compared with respect to power and energy consumption. Firstly the algorithm was run on a general purpose processor (gpp) NIOS and then hardware architecture of an Intellectual Property (IP) was designed. Furthermore IPs design was improved by applying a number of power optimization techniques. They involved RTL level clock gating, power driven synthesis with fitting and appropriate coding style. Results show that hardware implementation is much more energy efficient than a general purpose processor and power optimization schemes can reduce the overall power consumption on an FPGA

Directory of Open Access Journals

Digital library of Brno University of Technology

An IoT Endpoint System-on-Chip for Secure and Energy-Efficient Near-Sensor Analytics

Author: Benini Luca
Conti Francesco
Gautschi Michael
Gürkaynak Frank Kagan
Haugou Germain
Loi Igor
Mangard Stefan
Muehlberghuber Michael
Pullini Antonio
Rossi Davide
Schiavone Pasquale Davide
Schilling Robert
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 23/04/2017
Field of study

Near-sensor data analytics is a promising direction for IoT endpoints, as it minimizes energy spent on communication and reduces network load - but it also poses security concerns, as valuable data is stored or sent over the network at various stages of the analytics pipeline. Using encryption to protect sensitive data at the boundary of the on-chip analytics engine is a way to address data security issues. To cope with the combined workload of analytics and encryption in a tight power envelope, we propose Fulmine, a System-on-Chip based on a tightly-coupled multi-core cluster augmented with specialized blocks for compute-intensive data processing and encryption functions, supporting software programmability for regular computing tasks. The Fulmine SoC, fabricated in 65nm technology, consumes less than 20mW on average at 0.8V achieving an efficiency of up to 70pJ/B in encryption, 50pJ/px in convolution, or up to 25MIPS/mW in software. As a strong argument for real-life flexible application of our platform, we show experimental results for three secure analytics use cases: secure autonomous aerial surveillance with a state-of-the-art deep CNN consuming 3.16pJ per equivalent RISC op; local CNN-based face detection with secured remote recognition in 5.74pJ/op; and seizure detection with encrypted data collection from EEG within 12.7pJ/op.Comment: 15 pages, 12 figures, accepted for publication to the IEEE Transactions on Circuits and Systems - I: Regular Paper

arXiv.org e-Print Archive

Crossref

Archivio istituzionale della ricerca - Alma Mater Studiorum Università di Bologna