57 research outputs found

    On Real-Time AER 2-D Convolutions Hardware for Neuromorphic Spike-Based Cortical Processing

    Get PDF
    In this paper, a chip that performs real-time image convolutions with programmable kernels of arbitrary shape is presented. The chip is a first experimental prototype of reduced size to validate the implemented circuits and system level techniques. The convolution processing is based on the address–event-representation (AER) technique, which is a spike-based biologically inspired image and video representation technique that favors communication bandwidth for pixels with more information. As a first test prototype, a pixel array of 16x16 has been implemented with programmable kernel size of up to 16x16. The chip has been fabricated in a standard 0.35- m complimentary metal–oxide–semiconductor (CMOS) process. The technique also allows to process larger size images by assembling 2-D arrays of such chips. Pixel operation exploits low-power mixed analog–digital circuit techniques. Because of the low currents involved (down to nanoamperes or even picoamperes), an important amount of pixel area is devoted to mismatch calibration. The rest of the chip uses digital circuit techniques, both synchronous and asynchronous. The fabricated chip has been thoroughly tested, both at the pixel level and at the system level. Specific computer interfaces have been developed for generating AER streams from conventional computers and feeding them as inputs to the convolution chip, and for grabbing AER streams coming out of the convolution chip and storing and analyzing them on computers. Extensive experimental results are provided. At the end of this paper, we provide discussions and results on scaling up the approach for larger pixel arrays and multilayer cortical AER systems.Commission of the European Communities IST-2001-34124 (CAVIAR)Commission of the European Communities 216777 (NABAB)Ministerio de Educación y Ciencia TIC-2000-0406-P4Ministerio de Educación y Ciencia TIC-2003-08164-C03-01Ministerio de Educación y Ciencia TEC2006-11730-C03-01Junta de Andalucía TIC-141

    Reconfigurable Architectures and Systems for IoT Applications

    Get PDF
    abstract: Internet of Things (IoT) has become a popular topic in industry over the recent years, which describes an ecosystem of internet-connected devices or things that enrich the everyday life by improving our productivity and efficiency. The primary components of the IoT ecosystem are hardware, software and services. While the software and services of IoT system focus on data collection and processing to make decisions, the underlying hardware is responsible for sensing the information, preprocess and transmit it to the servers. Since the IoT ecosystem is still in infancy, there is a great need for rapid prototyping platforms that would help accelerate the hardware design process. However, depending on the target IoT application, different sensors are required to sense the signals such as heart-rate, temperature, pressure, acceleration, etc., and there is a great need for reconfigurable platforms that can prototype different sensor interfacing circuits. This thesis primarily focuses on two important hardware aspects of an IoT system: (a) an FPAA based reconfigurable sensing front-end system and (b) an FPGA based reconfigurable processing system. To enable reconfiguration capability for any sensor type, Programmable ANalog Device Array (PANDA), a transistor-level analog reconfigurable platform is proposed. CAD tools required for implementation of front-end circuits on the platform are also developed. To demonstrate the capability of the platform on silicon, a small-scale array of 24Ă—25 PANDA cells is fabricated in 65nm technology. Several analog circuit building blocks including amplifiers, bias circuits and filters are prototyped on the platform, which demonstrates the effectiveness of the platform for rapid prototyping IoT sensor interfaces. IoT systems typically use machine learning algorithms that run on the servers to process the data in order to make decisions. Recently, embedded processors are being used to preprocess the data at the energy-constrained sensor node or at IoT gateway, which saves considerable energy for transmission and bandwidth. Using conventional CPU based systems for implementing the machine learning algorithms is not energy-efficient. Hence an FPGA based hardware accelerator is proposed and an optimization methodology is developed to maximize throughput of any convolutional neural network (CNN) based machine learning algorithm on a resource-constrained FPGA.Dissertation/ThesisDoctoral Dissertation Electrical Engineering 201

    Analog Photonics Computing for Information Processing, Inference and Optimisation

    Full text link
    This review presents an overview of the current state-of-the-art in photonics computing, which leverages photons, photons coupled with matter, and optics-related technologies for effective and efficient computational purposes. It covers the history and development of photonics computing and modern analogue computing platforms and architectures, focusing on optimization tasks and neural network implementations. The authors examine special-purpose optimizers, mathematical descriptions of photonics optimizers, and their various interconnections. Disparate applications are discussed, including direct encoding, logistics, finance, phase retrieval, machine learning, neural networks, probabilistic graphical models, and image processing, among many others. The main directions of technological advancement and associated challenges in photonics computing are explored, along with an assessment of its efficiency. Finally, the paper discusses prospects and the field of optical quantum computing, providing insights into the potential applications of this technology.Comment: Invited submission by Journal of Advanced Quantum Technologies; accepted version 5/06/202

    Computer vision algorithms on reconfigurable logic arrays

    Full text link

    Embracing Low-Power Systems with Improvement in Security and Energy-Efficiency

    Get PDF
    As the economies around the world are aligning more towards usage of computing systems, the global energy demand for computing is increasing rapidly. Additionally, the boom in AI based applications and services has already invited the pervasion of specialized computing hardware architectures for AI (accelerators). A big chunk of research in the industry and academia is being focused on providing energy efficiency to all kinds of power hungry computing architectures. This dissertation adds to these efforts. Aggressive voltage underscaling of chips is one the effective low power paradigms of providing energy efficiency. This dissertation identifies and deals with the reliability and performance problems associated with this paradigm and innovates novel energy efficient approaches. Specifically, the properties of a low power security primitive have been improved and, higher performance has been unlocked in an AI accelerator (Google TPU) in an aggressively voltage underscaled environment. And, novel power saving opportunities have been unlocked by characterizing the usage pattern of a baseline TPU with rigorous mathematical analysis

    Event-Driven Technologies for Reactive Motion Planning: Neuromorphic Stereo Vision and Robot Path Planning and Their Application on Parallel Hardware

    Get PDF
    Die Robotik wird immer mehr zu einem Schlüsselfaktor des technischen Aufschwungs. Trotz beeindruckender Fortschritte in den letzten Jahrzehnten, übertreffen Gehirne von Säugetieren in den Bereichen Sehen und Bewegungsplanung noch immer selbst die leistungsfähigsten Maschinen. Industrieroboter sind sehr schnell und präzise, aber ihre Planungsalgorithmen sind in hochdynamischen Umgebungen, wie sie für die Mensch-Roboter-Kollaboration (MRK) erforderlich sind, nicht leistungsfähig genug. Ohne schnelle und adaptive Bewegungsplanung kann sichere MRK nicht garantiert werden. Neuromorphe Technologien, einschließlich visueller Sensoren und Hardware-Chips, arbeiten asynchron und verarbeiten so raum-zeitliche Informationen sehr effizient. Insbesondere ereignisbasierte visuelle Sensoren sind konventionellen, synchronen Kameras bei vielen Anwendungen bereits überlegen. Daher haben ereignisbasierte Methoden ein großes Potenzial, schnellere und energieeffizientere Algorithmen zur Bewegungssteuerung in der MRK zu ermöglichen. In dieser Arbeit wird ein Ansatz zur flexiblen reaktiven Bewegungssteuerung eines Roboterarms vorgestellt. Dabei wird die Exterozeption durch ereignisbasiertes Stereosehen erreicht und die Pfadplanung ist in einer neuronalen Repräsentation des Konfigurationsraums implementiert. Die Multiview-3D-Rekonstruktion wird durch eine qualitative Analyse in Simulation evaluiert und auf ein Stereo-System ereignisbasierter Kameras übertragen. Zur Evaluierung der reaktiven kollisionsfreien Online-Planung wird ein Demonstrator mit einem industriellen Roboter genutzt. Dieser wird auch für eine vergleichende Studie zu sample-basierten Planern verwendet. Ergänzt wird dies durch einen Benchmark von parallelen Hardwarelösungen wozu als Testszenario Bahnplanung in der Robotik gewählt wurde. Die Ergebnisse zeigen, dass die vorgeschlagenen neuronalen Lösungen einen effektiven Weg zur Realisierung einer Robotersteuerung für dynamische Szenarien darstellen. Diese Arbeit schafft eine Grundlage für neuronale Lösungen bei adaptiven Fertigungsprozesse, auch in Zusammenarbeit mit dem Menschen, ohne Einbußen bei Geschwindigkeit und Sicherheit. Damit ebnet sie den Weg für die Integration von dem Gehirn nachempfundener Hardware und Algorithmen in die Industrierobotik und MRK

    DynaProg for Scala

    Get PDF
    Dynamic programming is an algorithmic technique to solve problems that follow the Bellman’s principle: optimal solutions depends on optimal sub-problem solutions. The core idea behind dynamic programming is to memoize intermediate results into matrices to avoid multiple computations. Solving a dynamic programming problem consists of two phases: filling one or more matrices with intermediate solutions for sub-problems and recomposing how the final result was constructed (backtracking). In textbooks, problems are usually described in terms of recurrence relations between matrices elements. Expressing dynamic programming problems in terms of recursive formulae involving matrix indices might be difficult, if often error prone, and the notation does not capture the essence of the underlying problem (for example aligning two sequences). Moreover, writing correct and efficient parallel implementation requires different competencies and often a significant amount of time. In this project, we present DynaProg, a language embedded in Scala (DSL) to address dynamic programming problems on heterogeneous platforms. DynaProg allows the programmer to write concise programs based on ADP [1], using a pair of parsing grammar and algebra; these program can then be executed either on CPU or on GPU. We evaluate the performance of our implementation against existing work and our own hand-optimized baseline implementations for both the CPU and GPU versions. Experimental results show that plain Scala has a large overhead and is recommended to be used with small sequences (≤1024) whereas the generated GPU version is comparable with existing implementations: matrix chain multiplication has the same performance as our hand-optimized version (142% of the execution time of [2]) for a sequence of 4096 matrices, Smith-Waterman is twice slower than [3] on a pair of sequences of 6144 elements, and RNA folding is on par with [4] (95% running time) for sequences of 4096 elements. [1] Robert Giegerich and Carsten Meyer. Algebraic Dynamic Programming. [2] Chao-Chin Wu, Jenn-Yang Ke, Heshan Lin and Wu Chun Feng. Optimizing dynamic programming on graphics processing units via adaptive thread-level parallelism. [3] Edans Flavius de O. Sandes, Alba Cristina M. A. de Melo. Smith-Waterman alignment of huge sequences with GPU in linear space. [4] Guillaume Rizk and Dominique Lavenier. GPU accelerated RNA folding algorithm

    High speed event-based visual processing in the presence of noise

    Get PDF
    Standard machine vision approaches are challenged in applications where large amounts of noisy temporal data must be processed in real-time. This work aims to develop neuromorphic event-based processing systems for such challenging, high-noise environments. The novel event-based application-focused algorithms developed are primarily designed for implementation in digital neuromorphic hardware with a focus on noise robustness, ease of implementation, operationally useful ancillary signals and processing speed in embedded systems

    Object Recognition

    Get PDF
    Vision-based object recognition tasks are very familiar in our everyday activities, such as driving our car in the correct lane. We do these tasks effortlessly in real-time. In the last decades, with the advancement of computer technology, researchers and application developers are trying to mimic the human's capability of visually recognising. Such capability will allow machine to free human from boring or dangerous jobs
    • …
    corecore