20 research outputs found

    Bio-inspired foveal and peripheral visual sensing for saliency-based decision making in robotics

    Computer vision is an area of research that has grown at immense speed in the last few decades, tackling scene-understanding problems from very diverse fronts, such as image classification, object detection, localization, mapping and tracking. It has also long been understood that there are very valuable lessons to be learned from biology and applied to this research field, the human visual system being very likely the most studied brain mechanism. The eye's foveation system is a good example of such a lesson, since machines and animals often face a similar dilemma: given limited computing power and a field of view too wide to be attended to all at once, they must prioritize visual areas of interest in order to process information faster. While extensive models of artificial foveation have been presented, the re-emerging area of machine learning with deep neural networks has opened the question of how these two approaches can contribute to each other. Novel deep learning models often rely on the availability of substantial computing power, yet many areas of application face strict constraints; a good example is unmanned aerial vehicles, which, to be autonomous, must lift and power all of their computing equipment. This work studies how applying a foveation principle to down-scale images can reduce the number of operations required for object detection, and compares its effect to that of uniformly down-sampled images, given that Convolutional Neural Network (CNN) layers account for the bulk of those operations. Foveation requires prior knowledge of regions of interest on which to centre the fovea; this requirement is addressed by merging bottom-up saliency with top-down feedback about the objects the CNN has been trained to detect. Although saliency models have also been studied extensively over the last couple of decades, most often by comparing their performance against human-observer datasets, it remains an open question how they fit into wider information-processing paradigms and into functional representations of the human brain. An information flow scheme that encompasses these principles is proposed here. Finally, to give the model the capacity to operate coherently in the time domain, it adapts a representation of a well-established theory of the decision-making process that takes place in the basal ganglia region of the brain. The behaviour of this representation is then tested against human-observer data in an omnidirectional field of view, where the importance of selecting the most contextually relevant region of interest at each time step is highlighted.
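
    The central arithmetic argument, that convolutional cost scales with input area, so resolution spent on the periphery is largely wasted, can be illustrated with a toy cost model. The sketch below is not the author's model: the foveal radius, ring boundaries and down-sampling factors are arbitrary assumptions, chosen only to show how a foveated pyramid cuts the effective pixel count a CNN must process.

```python
import numpy as np

def foveated_pixel_budget(h, w, cx, cy, fovea_r=64, ring_scales=(1, 2, 4, 8)):
    """Toy cost model: pixels inside the fovea are kept at full
    resolution; each concentric ring is down-sampled by the next
    factor in ring_scales.  Returns the effective pixel count,
    a rough proxy for the multiply-accumulates a CNN spends on
    the input, since convolution cost scales with input area."""
    ys, xs = np.mgrid[0:h, 0:w]
    r = np.hypot(xs - cx, ys - cy)
    budget, inner = 0.0, 0.0
    for i, s in enumerate(ring_scales):
        outer = fovea_r * (2 ** i)          # ring outer radius (assumed schedule)
        mask = (r >= inner) & (r < outer)
        budget += mask.sum() / (s * s)      # down-sampling by s cuts pixels s^2-fold
        inner = outer
    budget += (r >= inner).sum() / (ring_scales[-1] ** 2)  # far periphery
    return budget

h, w = 1080, 1920
full = h * w
fov = foveated_pixel_budget(h, w, cx=960, cy=540)
print(f"uniform full-res: {full} px, foveated: {fov:.0f} px, "
      f"reduction: {full / fov:.1f}x")
```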

    Energy Efficient Neocortex-Inspired Systems with On-Device Learning

    Shifting compute workloads from the cloud toward edge devices can significantly improve overall latency for inference and learning; however, this paradigm shift exacerbates the resource constraints on the edge devices. Neuromorphic computing architectures, inspired by biological neural processes, are natural substrates for edge devices: they offer co-located memory, in-situ training, energy efficiency, high memory density, and compute capacity in a small form factor. Owing to these features, the recent past has seen a rapid proliferation of hybrid CMOS/memristor neuromorphic computing systems. However, most of these systems offer limited plasticity, target either spatial or temporal input streams but not both, and have not been demonstrated on large-scale heterogeneous tasks. There is a critical knowledge gap in designing scalable neuromorphic systems that can support hybrid plasticity for spatio-temporal input streams on edge devices. This research proposes Pyragrid, a low-latency, energy-efficient neuromorphic computing system for processing spatio-temporal information natively on the edge. Pyragrid is a full-scale custom hybrid CMOS/memristor architecture with analog computational modules and an underlying digital communication scheme. Pyragrid is designed for hierarchical temporal memory, a biomimetic sequence-memory algorithm inspired by the neocortex. It features a novel synthetic synapse representation that enables dynamic synaptic pathways with reduced memory usage and fewer interconnects. The dynamic growth of the synaptic pathways is emulated in the physical behaviour of the memristor device, while synaptic modulation is enabled through a custom training scheme optimized for area and power. Pyragrid features data reuse, in-memory computing, and event-driven sparse local computing, reducing data movement by ~44x and improving system throughput and power efficiency by ~3x and ~161x, respectively, over a custom digital CMOS design. The innate sparsity in Pyragrid results in overall robustness to noise and device failure, particularly when processing visual input and predicting time-series sequences. Porting the proposed system onto edge devices can enhance their computational capability, response time, and battery life.
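
    For readers unfamiliar with hierarchical temporal memory (HTM), the plasticity it asks of the memristors is unusually simple: each synapse carries a scalar "permanence" that is nudged up or down and only conducts once it crosses a threshold, which is what maps naturally onto gradual conductance changes in a device. Below is a minimal software sketch of that rule; the constants, sizes and names are illustrative assumptions, not Pyragrid's actual parameters.

```python
import numpy as np

CONNECTED = 0.5            # permanence threshold above which a synapse conducts (assumed)
P_INC, P_DEC = 0.05, 0.02  # reinforcement / punishment steps (assumed values)

def update_permanences(perm, active_inputs):
    """HTM-style plasticity for one dendritic segment: synapses aligned
    with active inputs are reinforced, all others decay.  In a hybrid
    CMOS/memristor system this gradual strengthening/weakening is what
    the device's conductance drift emulates; here it is a clipped add."""
    perm = np.where(active_inputs, perm + P_INC, perm - P_DEC)
    return np.clip(perm, 0.0, 1.0)

def segment_active(perm, active_inputs, threshold=4):
    """A segment fires if enough *connected* synapses (permanence >=
    CONNECTED) coincide with active inputs."""
    return np.sum((perm >= CONNECTED) & active_inputs) >= threshold

rng = np.random.default_rng(0)
perm = rng.uniform(0.3, 0.6, size=32)   # one segment, 32 potential synapses
pattern = (np.arange(32) % 4 == 0)      # a fixed sparse input: 8 of 32 bits on
for _ in range(10):                     # repeated exposure grows the pathway
    perm = update_permanences(perm, pattern)
print("segment recognises pattern:", segment_active(perm, pattern))
```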

    Solid State Circuits Technologies

    The evolution of solid-state circuit technology has a long history within a relatively short period of time. This technology has led to the modern information society that connects people and tools, to a large market, and to many types of products and applications. Solid-state circuit technology continues to evolve through breakthroughs and improvements every year. This book is devoted to reviewing and presenting novel approaches to some of the main issues involved in this exciting and vigorous technology. The book is composed of 22 chapters, written by authors from 30 different institutions located in 12 different countries throughout the Americas, Asia and Europe, reflecting the wide international contribution to the book. The broad range of subjects presented offers a general overview of the main issues in modern solid-state circuit technology, while also providing in-depth analyses of specific subjects for specialists. We believe the book is of great scientific and educational value for many readers. I am profoundly indebted to all of those involved in this work. First and foremost, I would like to acknowledge and thank the authors, who worked hard and generously agreed to share their results and knowledge. Second, I would like to express my gratitude to the Intech team, who invited me to edit the book and gave me their full support and a fruitful experience as we worked together to assemble it.

    A 4K-Input High-Speed Winner-Take-All (WTA) Circuit with Single-Winner Selection for Change-Driven Vision Sensors

    Winner-Take-All (WTA) circuits play an important role in applications where a single element must be selected according to its relevance, and they have been successfully applied in neural networks and vision sensors. These applications usually require a large number of inputs to the WTA circuit, especially in vision, where thousands to millions of pixels may compete to be selected. WTA circuits usually exhibit poor response-time scaling with the number of competitors, and most current WTA implementations are designed to work with fewer than 100 inputs. Another problem related to a large number of inputs is the difficulty of selecting just one winner, since many competitors may differ by less than the WTA resolution. In this paper, a WTA circuit is presented that handles more than four thousand inputs (to the best of our knowledge, the largest WTA reported to date), with response times below one microsecond and a guarantee that exactly one winner is selected. This performance is obtained by combining a standard analog WTA circuit with a fast digital single-winner selector, at almost no size penalty. The WTA circuit has been successfully employed in the fabrication of a Selective Change-Driven Vision Sensor in 180 nm CMOS technology. Both simulated and experimental results are presented, showing that a single-pixel event can be selected in just 560 ns, and a multi-pixel event can be processed in 100 μs. Achieving similar results with a conventional approach would require a camera working at more than 1 Mfps for single-pixel event detection, and at 10 kfps to process the whole multi-pixel event.
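
    The paper's two-stage idea, an analog WTA that cannot distinguish near-ties followed by a digital stage that breaks them, can be captured in a short behavioural model. The sketch below illustrates the selection logic only, not the circuit; the resolution figure is an assumed value.

```python
def single_winner(currents, resolution=0.01):
    """Behavioural model of a two-stage WTA.
    Stage 1 (analog WTA): keep every input within `resolution` of the
    maximum -- these are indistinguishable to the analog stage.
    Stage 2 (digital single-winner selector): a priority encoder picks
    the lowest-index candidate, guaranteeing exactly one winner."""
    peak = max(currents)
    candidates = [i for i, c in enumerate(currents) if peak - c <= resolution]
    return candidates[0]   # fixed-priority tie break -> always a single winner

# Three pixels tie to within the analog resolution; exactly one is reported.
inputs = [0.20, 0.905, 0.91, 0.903]
print(single_winner(inputs))   # -> 1 (0.905 is within 0.01 of the 0.91 peak)
```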

    Hardware neural systems for applications: a pulsed analog approach

    A Flexible, Low-Power, Programmable Unsupervised Neural Network Based on Microcontrollers for Medical Applications

    We present an implementation and laboratory tests of a winner-take-all (WTA) artificial neural network (NN) on two microcontrollers (μC) with the ARM Cortex-M3 and AVR cores. The prospective application of this device is in wireless body sensor networks (WBSN), in the on-line analysis of electrocardiographic (ECG) and electromyographic (EMG) biomedical signals. The proposed device will be used as a base station in the WBSN, acquiring and analysing the signals from sensors placed on the human body. The proposed system is equipped with an analog-to-digital converter (ADC) and allows for multi-channel acquisition of analog signals, preprocessing (filtering) and further analysis.
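
    As context for what such a WTA network computes, the sketch below implements textbook competitive (winner-takes-all) learning, in which only the neuron closest to each input adapts, so the weight vectors converge to cluster prototypes (e.g. classes of beat shapes). All names, dimensions and rates are illustrative assumptions; the actual firmware on a μC would naturally be fixed-point C rather than Python.

```python
import numpy as np

def wta_train(samples, n_neurons=4, lr=0.05, epochs=20, seed=0):
    """Textbook competitive learning: for each input vector only the
    closest neuron 'wins' and moves toward that input, so the weight
    vectors converge to cluster prototypes."""
    rng = np.random.default_rng(seed)
    w = samples[rng.choice(len(samples), n_neurons, replace=False)].copy()
    for _ in range(epochs):
        for x in samples:
            winner = np.argmin(np.linalg.norm(w - x, axis=1))
            w[winner] += lr * (x - w[winner])   # only the winner learns
    return w

def wta_classify(w, x):
    return int(np.argmin(np.linalg.norm(w - x, axis=1)))

# Synthetic stand-in for preprocessed beat features: two noisy clusters.
rng = np.random.default_rng(1)
a = rng.normal(0.0, 0.1, (50, 8))
b = rng.normal(1.0, 0.1, (50, 8))
prototypes = wta_train(np.vstack([a, b]), n_neurons=2)
print(wta_classify(prototypes, a[0]), wta_classify(prototypes, b[0]))  # different ids
```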

    CMOS optical centroid processor for an integrated Shack-Hartmann wavefront sensor

    A Shack-Hartmann wavefront sensor is used to detect the distortion of light in an optical wavefront. It does this by sampling the wavefront with an array of lenslets and measuring the displacement of the focused spots from reference positions. These displacements are linearly related to the local wavefront tilts, from which the entire wavefront can be reconstructed. In most Shack-Hartmann wavefront sensors, a CCD is used to sample the entire wavefront, typically at a rate of 25 to 60 Hz, and a whole frame of light spots is read out before their positions are processed, resulting in a data bottleneck. In this design, parallel processing is achieved by incorporating local centroid processing for each focused spot, so that only reduced-bandwidth data need to be transferred off-chip, at a high rate. Incorporating centroid processing at the sensor level requires levels of circuit integration not possible with CCD technology, so a standard 0.7 μm CMOS technology was used instead; however, photodetector structures in this technology are not well characterised. Characterisation of several common photodiode structures was therefore carried out, showing good responsivity of the order of 0.3 A/W. Prior to on-chip fabrication, a hardware emulation system built on a reprogrammable FPGA implemented the centroiding algorithm successfully. Subsequently, the design was implemented as a single-chip CMOS solution. The fabricated optical centroid processor computed and transmitted the centroids at a rate of more than 2.4 kHz, which, when integrated as an array of tilt sensors, will allow a data rate that is independent of the number of tilt sensors employed. Besides removing the data bottleneck present in current systems, the design also offers advantages in terms of power consumption, system size and cost, and was shown to be highly scalable to a complete low-cost real-time adaptive optics system.
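
    The per-spot computation being parallelised is an intensity-weighted centroid, whose displacement from a reference position is proportional to the local wavefront tilt. A minimal numerical sketch of that arithmetic (not of the chip's analog circuitry) follows; the spot parameters are made up for illustration.

```python
import numpy as np

def spot_centroid(patch):
    """Intensity-weighted centroid of one lenslet's sub-image:
    x_c = sum(x * I) / sum(I), and likewise for y.  Computing this
    locally per spot means only (x_c, y_c) pairs leave the chip,
    which is what removes the full-frame readout bottleneck."""
    ys, xs = np.mgrid[0:patch.shape[0], 0:patch.shape[1]]
    total = patch.sum()
    return float((xs * patch).sum() / total), float((ys * patch).sum() / total)

# A synthetic focused spot, slightly displaced from the patch centre.
ys, xs = np.mgrid[0:16, 0:16]
spot = np.exp(-((xs - 9.0) ** 2 + (ys - 7.5) ** 2) / (2 * 1.5 ** 2))
xc, yc = spot_centroid(spot)
ref = (7.5, 7.5)   # reference (undistorted) spot position
print(f"tilt-proportional displacement: dx={xc - ref[0]:+.2f}, "
      f"dy={yc - ref[1]:+.2f} px")
```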

    Real-Time High-Resolution Multiple-Camera Depth Map Estimation Hardware and Its Applications

    Depth information is used in a variety of 3D signal processing applications, such as autonomous navigation of robots and driving systems, object detection and tracking, computer games, 3D television, and free-viewpoint synthesis. These applications require high accuracy and speed from depth estimation. Depth maps can be generated using disparity estimation methods, which rely on stereo matching between multiple images. The computational complexity of disparity estimation algorithms, and the large size and bandwidth they demand of external and internal memory, make real-time disparity estimation challenging, especially for high-resolution images. This thesis proposes high-resolution, high-quality multiple-camera depth map estimation hardware. The proposed hardware is verified in real time with a complete system, from initial image capture to display and applications, and the details of the complete system are presented. The proposed binocular and trinocular adaptive-window-size disparity estimation algorithms are carefully designed to suit real-time hardware implementation, allowing efficient parallel and local processing while providing high-quality results. The proposed binocular and trinocular disparity estimation hardware implementations can process 55 frames per second on a Virtex-7 FPGA at 1024 x 768 XGA video resolution for a 128-pixel disparity range. The proposed binocular disparity estimation hardware provides the best quality among existing real-time high-resolution disparity estimation hardware implementations. A novel compressed-look-up-table-based rectification algorithm and its real-time hardware implementation are presented; the low-complexity decompression process of the rectification hardware uses a negligible amount of the FPGA's LUT and DFF resources and requires no external memory. The first real-time high-resolution free-viewpoint synthesis hardware utilizing three-camera disparity estimation is presented; it generates high-quality free-viewpoint video in real time for any horizontally aligned virtual camera positioned between the leftmost and rightmost physical cameras. The full embedded depth estimation system is explained: it transfers disparity results, together with synchronized RGB pixels, to a PC for application development. Several real-time applications were developed on the PC using the resulting RGB+D data: depth-based image thresholding, speed and distance measurement, head-hands-shoulders tracking, a virtual mouse using hand tracking, and face tracking integrated with free-viewpoint synthesis. The proposed binocular disparity estimation hardware is also implemented as an ASIC; the ASIC implementation imposes additional constraints with respect to the FPGA implementation, and these restrictions, their efficient solutions, and the ASIC implementation results are presented. In addition, a very-high-resolution (82.3 MP) 360° x 90° omnidirectional multiple-camera system is proposed. The hemispherical camera system can view target locations close to the horizontal plane with more than two cameras, so it can be used for high-resolution 360° depth map estimation and its applications in the future.
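
    The kernel inside any such disparity estimator is a matching search: for each left-image pixel, find the horizontal offset that minimises a window cost, then convert disparity to depth via Z = f·B/d. The sketch below is a deliberately naive fixed-window sum-of-absolute-differences matcher for reference only; the thesis hardware uses adaptive windows and massive parallelism, and the camera parameters shown are assumed values.

```python
import numpy as np

def sad_disparity(left, right, max_disp=16, win=5):
    """Naive fixed-window block matching: for each pixel pick the
    horizontal shift d minimising the sum of absolute differences (SAD)
    between a win x win window in `left` and the window displaced d
    pixels to the left in `right`.  The thesis hardware instead adapts
    the window size and evaluates candidates in parallel; this loop is
    only the reference computation."""
    h, w = left.shape
    half = win // 2
    disp = np.zeros((h, w), dtype=np.int32)
    for y in range(half, h - half):
        for x in range(half + max_disp, w - half):
            win_l = left[y-half:y+half+1, x-half:x+half+1]
            costs = [np.abs(win_l - right[y-half:y+half+1,
                                          x-d-half:x-d+half+1]).sum()
                     for d in range(max_disp)]
            disp[y, x] = int(np.argmin(costs))
    return disp

rng = np.random.default_rng(0)
right = rng.random((32, 64)).astype(np.float32)
left = np.roll(right, 5, axis=1)            # left view = right shifted 5 px
print(sad_disparity(left, right)[16, 40])   # -> 5 at an interior pixel

# Depth from disparity: Z = f * B / d (f: focal length in pixels,
# B: baseline in metres -- assumed example values).
f_px, baseline_m, d = 1000.0, 0.12, 40
print(f"Z = {f_px * baseline_m / d:.2f} m")
```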