Search CORE

1,991 research outputs found

ACE16K: The Third Generation of Mixed-Signal SIMD-CNN ACE Chips Toward VSoCs

Author: Carmona Galán Ricardo
Carranza González Luis
Domínguez Castro Rafael
Espejo Meana Servando Carlos
Jiménez Garrido Francisco José
Liñán Cembrano Gustavo
Roca Moreno Elisenda
Rodríguez Vázquez Ángel Benito
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2004
Field of study

Today, with 0.18-μm technologies mature and stable enough for mixed-signal design with a large variety of CMOS compatible optical sensors available and with 0.09-μm technologies knocking at the door of designers, we can face the design of integrated systems, instead of just integrated circuits. In fact, significant progress has been made in the last few years toward the realization of vision systems on chips (VSoCs). Such VSoCs are eventually targeted to integrate within a semiconductor substrate the functions of optical sensing, image processing in space and time, high-level processing, and the control of actuators. The consecutive generations of ACE chips define a roadmap toward flexible VSoCs. These chips consist of arrays of mixed-signal processing elements (PEs) which operate in accordance with single instruction multiple data (SIMD) computing architectures and exhibit the functional features of CNN Universal Machines. They have been conceived to cover the early stages of the visual processing path in a fully-parallel manner, and hence more efficiently than DSP-based systems. Across the different generations, different improvements and modifications have been made looking to converge with the newest discoveries of neurobiologists regarding the behavior of natural retinas. This paper presents considerations pertaining to the design of a member of the third generation of ACE chips, namely to the so-called ACE16k chip. This chip, designed in a 0.35-μm standard CMOS technology, contains about 3.75 million transistors and exhibits peak computing figures of 330 GOPS, 3.6 GOPS/mm2 and 82.5 GOPS/W. Each PE in the array contains a reconfigurable computing kernel capable of calculating linear convolutions on 3×3 neighborhoods in less than 1.5 μs, imagewise Boolean combinations in less than 200 ns, imagewise arithmetic operations in about 5 μs, and CNN-like temporal evolutions with a time constant of about 0.5 μs. Unfortunately, the many ideas underlying the design of this chip cannot be covered in a single paper; hence, this paper is focused on, first, placing the ACE16k in the ACE chip roadmap and, then, discussing the most significant modifications of ACE16K versus its predecessors in the family.LOCUST IST2001—38 097VISTA TIC2003—09 817 - C02—01Office of Naval Research N000 140 210 88

idUS. Depósito de Investigación Universidad de Sevilla

Toolflows for Mapping Convolutional Neural Networks on FPGAs: A Survey and Future Directions

Author: Bouganis Christos-Savvas
Kouris Alexandros
Venieris Stylianos I.
Publication venue
Publication date: 19/02/2018
Field of study

In the past decade, Convolutional Neural Networks (CNNs) have demonstrated state-of-the-art performance in various Artificial Intelligence tasks. To accelerate the experimentation and development of CNNs, several software frameworks have been released, primarily targeting power-hungry CPUs and GPUs. In this context, reconfigurable hardware in the form of FPGAs constitutes a potential alternative platform that can be integrated in the existing deep learning ecosystem to provide a tunable balance between performance, power consumption and programmability. In this paper, a survey of the existing CNN-to-FPGA toolflows is presented, comprising a comparative study of their key characteristics which include the supported applications, architectural choices, design space exploration methods and achieved performance. Moreover, major challenges and objectives introduced by the latest trends in CNN algorithmic research are identified and presented. Finally, a uniform evaluation methodology is proposed, aiming at the comprehensive, complete and in-depth evaluation of CNN-to-FPGA toolflows.Comment: Accepted for publication at the ACM Computing Surveys (CSUR) journal, 201

arXiv.org e-Print Archive

Spiral - Imperial College Digital Repository

CMOS Vision Sensors: Embedding Computer Vision at Imaging Front-Ends

Author: Carmona Galán Ricardo
Fernández Berni Jorge
Leñero Bardallo Juan Antonio
Rodríguez Vázquez Ángel Benito
Vornicu Ion
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2018
Field of study

CMOS Image Sensors (CIS) are key for imaging technol-ogies. These chips are conceived for capturing opticalscenes focused on their surface, and for delivering elec-trical images, commonly in digital format. CISs may incor-porate intelligence; however, their smartness basicallyconcerns calibration, error correction and other similartasks. The term CVISs (CMOS VIsion Sensors) definesother class of sensor front-ends which are aimed at per-forming vision tasks right at the focal plane. They havebeen running under names such as computational imagesensors, vision sensors and silicon retinas, among others. CVIS and CISs are similar regarding physical imple-mentation. However, while inputs of both CIS and CVISare images captured by photo-sensors placed at thefocal-plane, CVISs primary outputs may not be imagesbut either image features or even decisions based on thespatial-temporal analysis of the scenes. We may hencestate that CVISs are more “intelligent” than CISs as theyfocus on information instead of on raw data. Actually,CVIS architectures capable of extracting and interpretingthe information contained in images, and prompting reac-tion commands thereof, have been explored for years inacademia, and industrial applications are recently ramp-ing up.One of the challenges of CVISs architects is incorporat-ing computer vision concepts into the design flow. Theendeavor is ambitious because imaging and computervision communities are rather disjoint groups talking dif-ferent languages. The Cellular Nonlinear Network Univer-sal Machine (CNNUM) paradigm, proposed by Profs.Chua and Roska, defined an adequate framework forsuch conciliation as it is particularly well suited for hard-ware-software co-design [1]-[4]. This paper overviewsCVISs chips that were conceived and prototyped at IMSEVision Lab over the past twenty years. Some of them fitthe CNNUM paradigm while others are tangential to it. Allthem employ per-pixel mixed-signal processing circuitryto achieve sensor-processing concurrency in the quest offast operation with reduced energy budget.Junta de Andalucía TIC 2012-2338Ministerio de Economía y Competitividad TEC 2015-66878-C3-1-R y TEC 2015-66878-C3-3-

idUS. Depósito de Investigación Universidad de Sevilla

A micropower centroiding vision processor

Author: Constandinou T G
Toumazou C
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/06/2006
Field of study

Published versio

Spiral - Imperial College Digital Repository

A 64mW DNN-based Visual Navigation Engine for Autonomous Nano-Drones

Author: Benini Luca
Conti Francesco
Flamand Eric
Loquercio Antonio
Palossi Daniele
Scaramuzza Davide
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 28/08/2018
Field of study

Fully-autonomous miniaturized robots (e.g., drones), with artificial intelligence (AI) based visual navigation capabilities are extremely challenging drivers of Internet-of-Things edge intelligence capabilities. Visual navigation based on AI approaches, such as deep neural networks (DNNs) are becoming pervasive for standard-size drones, but are considered out of reach for nanodrones with size of a few cm

{}^\mathrm{2}

. In this work, we present the first (to the best of our knowledge) demonstration of a navigation engine for autonomous nano-drones capable of closed-loop end-to-end DNN-based visual navigation. To achieve this goal we developed a complete methodology for parallel execution of complex DNNs directly on-bard of resource-constrained milliwatt-scale nodes. Our system is based on GAP8, a novel parallel ultra-low-power computing platform, and a 27 g commercial, open-source CrazyFlie 2.0 nano-quadrotor. As part of our general methodology we discuss the software mapping techniques that enable the state-of-the-art deep convolutional neural network presented in [1] to be fully executed on-board within a strict 6 fps real-time constraint with no compromise in terms of flight results, while all processing is done with only 64 mW on average. Our navigation engine is flexible and can be used to span a wide performance range: at its peak performance corner it achieves 18 fps while still consuming on average just 3.5% of the power envelope of the deployed nano-aircraft.Comment: 15 pages, 13 figures, 5 tables, 2 listings, accepted for publication in the IEEE Internet of Things Journal (IEEE IOTJ

arXiv.org e-Print Archive

Infoscience - École polytechnique fédérale de Lausanne

Archivio istituzionale della ricerca - Alma Mater Studiorum Università di Bologna

ZORA

Integrated Circuitry to Detect Slippage Inspired by Human Skin and Artificial Retinas

Author: Liñán Cembrano Gustavo
Maldonado López Rocío
Rodríguez Vázquez Ángel Benito
Vidal Verdú Fernando
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2009
Field of study

This paper presents a bioinspired integrated tactile coprocessor that is able to generate a warning in the case of slippage via the data provided by a tactile sensor. Some implementations use different layers of piezoresistive and piezoelectric materials to build upon the raw sensor and obtain the static (pressure) as well as the dynamic (slippage) information. In this paper, a simple raw sensor is used, and a circuitry is implemented, which is able to extract the dynamic information from a single piezoresistive layer. The circuitry was inspired by structures found in human skin and retina, as they are biological systems made up of a dense network of receptors. It is largely based on an artificial retina , which is able to detect motion by using relatively simple spatial temporal dynamics. The circuitry was adapted to respond in the bandwidth of microvibrations produced by early slippage, resembling human skin. Experimental measurements from a chip implemented in a 0.35-mum four-metal two-poly standard CMOS process are presented to show both the performance of the building blocks included in each processing node and the operation of the whole system as a detector of early slippage.Ministerio de Economía y Competitividad TEC2006-12376-C02-01Gobierno de España TEC2006- 1572

idUS. Depósito de Investigación Universidad de Sevilla