1,017 research outputs found

    Event-Driven Technologies for Reactive Motion Planning: Neuromorphic Stereo Vision and Robot Path Planning and Their Application on Parallel Hardware

    Robotics is increasingly becoming a key driver of technological progress. Despite impressive advances in recent decades, mammalian brains still outperform even the most powerful machines in vision and motion planning. Industrial robots are very fast and precise, but their planning algorithms are not capable enough for the highly dynamic environments required for human-robot collaboration (HRC). Without fast and adaptive motion planning, safe HRC cannot be guaranteed. Neuromorphic technologies, including visual sensors and hardware chips, operate asynchronously and therefore process spatio-temporal information very efficiently. Event-based visual sensors in particular already outperform conventional synchronous cameras in many applications. Event-based methods therefore have great potential to enable faster and more energy-efficient motion control algorithms for HRC. This thesis presents an approach to flexible, reactive motion control of a robot arm. Exteroception is achieved through event-based stereo vision, and path planning is implemented in a neural representation of the configuration space. The multi-view 3D reconstruction is evaluated through a qualitative analysis in simulation and transferred to a stereo system of event-based cameras. A demonstrator with an industrial robot is used to evaluate the reactive, collision-free online planning. It is also used for a comparative study against sample-based planners. This is complemented by a benchmark of parallel hardware solutions, for which path planning in robotics was chosen as the test scenario. The results show that the proposed neural solutions are an effective way to realize robot control for dynamic scenarios. This work lays a foundation for neural solutions in adaptive manufacturing processes, including in collaboration with humans, without compromising speed or safety. It thereby paves the way for integrating brain-inspired hardware and algorithms into industrial robotics and HRC

    A programmable triangular neighborhood function for a Kohonen self-organizing map implemented on chip

    An efficient transistor-level implementation of a flexible, programmable Triangular Function (TF) is presented. It can be used as a Triangular Neighborhood Function (TNF) in ultra-low power self-organizing maps (SOMs) realized as an Application-Specific Integrated Circuit (ASIC). The proposed TNF block is a component of a larger neighborhood mechanism, whose role is to determine the distance between the winning neuron and all neighboring neurons. Detailed simulations carried out for the software model of such a network show that the TNF forms a good approximation of the Gaussian Neighborhood Function (GNF), while being much easier to implement in hardware. The overall mechanism is very fast. In the CMOS 0.18 µm technology, distances to all neighboring neurons are determined in parallel, within a time not exceeding 11 ns, for an example neighborhood range, R, of 15. The TNF blocks in particular neurons require another 6 ns to calculate the output values directly used in the adaptation process. This is also performed in parallel in all neurons. As a result, after determining the winning neuron, the entire map is ready for adaptation within a time not exceeding 17 ns, even for large numbers of neurons. This feature allows for the realization of ultra-low power SOMs, which are a hundred times faster than similar SOMs realized on a PC. The signal resolution at the output of the TNF block has a dominant impact on the overall energy consumption as well as the silicon area. Detailed system-level simulations of the SOM show that even for low resolutions of 3 to 6 bits, the learning abilities of the SOM are not affected. The circuit performance has been verified by means of transistor-level Hspice simulations carried out for different transistor models and different values of supply voltage and environment temperature, a typical procedure for commercial chips that makes the obtained results reliable. (C) 2011 Elsevier Ltd. All rights reserved
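The comparison between a triangular and a Gaussian neighborhood function can be illustrated with a small software sketch. This is not the circuit from the abstract: the linear decay to zero at distance `radius`, the choice of `sigma`, and the helper names (`triangular_neighborhood`, `gaussian_neighborhood`, `quantize`) are assumptions made for illustration only.

```python
import math


def triangular_neighborhood(distance, radius):
    """Assumed TNF: 1 at the winner, falling linearly to 0 at distance >= radius."""
    if radius <= 0:
        return 1.0 if distance == 0 else 0.0
    return max(0.0, 1.0 - distance / radius)


def gaussian_neighborhood(distance, sigma):
    """Standard Gaussian neighborhood function, shown for comparison."""
    return math.exp(-(distance ** 2) / (2.0 * sigma ** 2))


def quantize(value, bits):
    """Round a value in [0, 1] to the nearest of 2**bits uniformly spaced levels."""
    levels = (1 << bits) - 1
    return round(value * levels) / levels


if __name__ == "__main__":
    radius = 15          # example neighborhood range R from the abstract
    sigma = radius / 3   # hypothetical width chosen only for this comparison
    for d in range(0, radius + 1, 3):
        tnf = triangular_neighborhood(d, radius)
        gnf = gaussian_neighborhood(d, sigma)
        print(d, round(tnf, 3), round(gnf, 3), quantize(tnf, 3))
```

The last column shows the triangular value restricted to 3 bits, the low end of the output resolutions mentioned in the abstract.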

    FPGA-Based Acceleration of the Self-Organizing Map (SOM) Algorithm using High-Level Synthesis

    One of the fastest growing and the most demanding areas of computer science is Machine Learning (ML). Self-Organizing Map (SOM), categorized as unsupervised ML, is a popular data-mining algorithm widely used in Artificial Neural Network (ANN) for mapping high dimensional data into low dimensional feature maps. SOM, being computationally intensive, requires high computational time and power when dealing with large datasets. Acceleration of many computationally intensive algorithms can be achieved using Field-Programmable Gate Arrays (FPGAs) but it requires extensive hardware knowledge and longer development time when employing traditional Hardware Description Language (HDL) based design methodology. Open Computing Language (OpenCL) is a standard framework for writing parallel computing programs that execute on heterogeneous computing systems. Intel FPGA Software Development Kit for OpenCL (IFSO) is a High-Level Synthesis (HLS) tool that provides a more efficient alternative to HDL-based design. This research presents an optimized OpenCL implementation of SOM algorithm on Stratix V and Arria 10 FPGAs using IFSO. Compared to recent SOM implementations on Central Processing Unit (CPU) and Graphics Processing Unit (GPU), our OpenCL implementation on FPGAs provides superior speed performance and power consumption results. Stratix V achieves speedup of 1.41x - 16.55x compared to AMD and Intel CPU and 2.18x compared to Nvidia GPU whereas Arria 10 achieves speedup of 1.63x - 19.15x compared to AMD and Intel CPU and 2.52x compared to Nvidia GPU. In terms of power consumption, Stratix V is 35.53x and 42.53x whereas Arria 10 is 15.82x and 15.93x more power efficient compared to CPU and GPU respectively
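For orientation, one training step of a generic SOM can be sketched in plain Python as below. This is a textbook-style sketch, not the OpenCL kernel from the abstract; the function names, the Gaussian neighborhood, and the parameters `learning_rate` and `sigma` are assumptions. The distance search over all neurons is the kind of data-parallel loop that hardware accelerators usually target.

```python
import math


def best_matching_unit(weights, sample):
    """Return the index of the neuron whose weight vector is closest to sample."""
    def squared_distance(w):
        return sum((wi - xi) ** 2 for wi, xi in zip(w, sample))
    return min(range(len(weights)), key=lambda i: squared_distance(weights[i]))


def som_update(weights, positions, sample, learning_rate, sigma):
    """Apply one SOM update in place and return the winning neuron's index.

    weights   -- list of weight vectors, one per neuron
    positions -- list of grid coordinates, one per neuron
    """
    bmu = best_matching_unit(weights, sample)
    for i, w in enumerate(weights):
        # Squared distance on the map grid between neuron i and the winner.
        grid_d2 = sum((a - b) ** 2 for a, b in zip(positions[i], positions[bmu]))
        h = math.exp(-grid_d2 / (2.0 * sigma ** 2))
        # Pull the weight vector toward the sample, scaled by the neighborhood.
        weights[i] = [wi + learning_rate * h * (xi - wi) for wi, xi in zip(w, sample)]
    return bmu


if __name__ == "__main__":
    weights = [[0.0, 0.0], [1.0, 1.0]]
    positions = [(0, 0), (0, 1)]
    winner = som_update(weights, positions, [0.9, 0.9], learning_rate=0.5, sigma=1.0)
    print(winner, weights)
```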

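    Concepção e realização de um framework para sistemas embarcados baseados em FPGA aplicado a um classificador Floresta de Caminhos Ótimos (Design and implementation of a framework for FPGA-based embedded systems applied to an Optimum-Path Forest classifier)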

    Advisors: Eurípedes Guilherme de Oliveira Nóbrega, Isabelle Fantoni-Coichot, Vincent Frémont. Thesis (doctorate) - Universidade Estadual de Campinas, Faculdade de Engenharia Mecânica, Université de Technologie de Compiègne. Abstract: Many modern applications rely on Artificial Intelligence methods such as automatic classification. However, the computational cost associated with these techniques limits their use in resource-constrained embedded platforms. Large amounts of data can exceed the computational power available in such embedded environments, which makes designing them a challenging task. Common processing pipelines use many functions with high computational cost, which creates the need to combine high computational capacity with energy efficiency. One strategy to overcome this limitation and provide sufficient computational power together with low energy consumption is the use of specialized hardware such as FPGAs. This class of devices is widely known for its good performance-to-consumption ratio, making it an interesting alternative for building effective and efficient embedded systems. This thesis proposes an FPGA-based framework for performance acceleration of a classification algorithm to be implemented in an embedded system. Acceleration is achieved using a SIMD-based parallelization scheme that takes advantage of the fine-grained parallelism of FPGAs. The proposed system is implemented and tested on actual FPGA hardware. To validate the architecture, a graph-based classifier, the OPF, is evaluated in a proposed application and afterward implemented on the proposed architecture. The OPF study led to the proposal of a new learning algorithm using evolutionary computation concepts, aimed at reducing classification processing time, which, combined with the hardware implementation, offers sufficient performance acceleration to be applied in a variety of embedded systems. Doctorate; Solid Mechanics and Mechanical Design; Doctor in Mechanical Engineering; 3077/2013-09; CAPE
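As background, the classification step of a generic supervised Optimum-Path Forest can be sketched as follows. It assumes a forest has already been trained, so each training node carries a path cost and a label, and it uses the usual max-based path cost with Euclidean distance. It is not the thesis's new learning algorithm or its FPGA architecture, and the name `opf_classify` and the tuple layout are hypothetical. Each loop iteration is independent of the others, which is the property a SIMD scheme can exploit.

```python
import math


def opf_classify(sample, train_nodes):
    """Label a sample with a trained Optimum-Path Forest.

    train_nodes -- list of (features, cost, label) tuples, where cost is the
                   optimum path cost assigned to that node during training.
    """
    best_cost, best_label = math.inf, None
    for features, cost, label in train_nodes:
        # Cost of extending this node's optimum path to the new sample.
        path_cost = max(cost, math.dist(sample, features))
        if path_cost < best_cost:
            best_cost, best_label = path_cost, label
    return best_label


if __name__ == "__main__":
    # Two prototype nodes (cost 0) from different classes; toy data only.
    nodes = [((0.0, 0.0), 0.0, "a"), ((1.0, 1.0), 0.0, "b")]
    print(opf_classify((0.1, 0.1), nodes))
    print(opf_classify((0.9, 0.8), nodes))
```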

    A Decade of Neural Networks: Practical Applications and Prospects

    The Jet Propulsion Laboratory Neural Network Workshop, sponsored by NASA and DOD, brings together sponsoring agencies, active researchers, and the user community to formulate a vision for the next decade of neural network research and application prospects. While the speed and computing power of microprocessors continue to grow at an ever-increasing pace, the demand to intelligently and adaptively deal with the complex, fuzzy, and often ill-defined world around us remains to a large extent unaddressed. Powerful, highly parallel computing paradigms such as neural networks promise to have a major impact in addressing these needs. Papers in the workshop proceedings highlight benefits of neural networks in real-world applications compared to conventional computing techniques. Topics include fault diagnosis, pattern recognition, and multiparameter optimization

    Event-based neuromorphic stereo vision

    Neural Network Adaptations to Hardware Implementations

    In order to take advantage of the massive parallelism offered by artificial neural networks, hardware implementations are essential. However, most standard neural network models are not very suitable for implementation in hardware and adaptations are needed. In this section an overview is given of the various issues that are encountered when mapping an ideal neural network model onto a compact and reliable neural network hardware implementation, like quantization, handling nonuniformities and nonideal responses, and restraining computational complexity. Furthermore, a broad range of hardware-friendly learning rules is presented, which allow for simpler and more reliable hardware implementations. The relevance of these neural network adaptations to hardware is illustrated by their application in existing hardware implementations
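Quantization, the first hardware issue listed above, can be illustrated with a minimal sketch of symmetric uniform weight quantization. The function name `quantize_weight`, the clipping range `w_max`, and the signed-integer encoding are assumptions for illustration and are not taken from the text.

```python
def quantize_weight(w, bits, w_max=1.0):
    """Clip w to [-w_max, w_max] and round it to a signed code of the given width.

    Returns the value the hardware would effectively use after quantization.
    """
    levels = (1 << (bits - 1)) - 1          # e.g. 127 for 8 bits
    clipped = max(-w_max, min(w_max, w))
    code = round(clipped / w_max * levels)  # integer stored in hardware
    return code * w_max / levels


if __name__ == "__main__":
    for bits in (3, 4, 8):
        q = quantize_weight(0.3, bits)
        print(bits, q, abs(q - 0.3))
```

The printed error shrinks as the bit width grows, which is the trade-off between precision and hardware cost that such adaptations have to balance.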