121 research outputs found

    Image Feature Extraction Acceleration

    Get PDF
    Image feature extraction is instrumental for most of the best-performing algorithms in computer vision. However, it is also expensive in terms of computational and memory resources for embedded systems due to the need of dealing with individual pixels at the earliest processing levels. In this regard, conventional system architectures do not take advantage of potential exploitation of parallelism and distributed memory from the very beginning of the processing chain. Raw pixel values provided by the front-end image sensor are squeezed into a high-speed interface with the rest of system components. Only then, after deserializing this massive dataflow, parallelism, if any, is exploited. This chapter introduces a rather different approach from an architectural point of view. We present two Application-Specific Integrated Circuits (ASICs) where the 2-D array of photo-sensitive devices featured by regular imagers is combined with distributed memory supporting concurrent processing. Custom circuitry is added per pixel in order to accelerate image feature extraction right at the focal plane. Specifically, the proposed sensing-processing chips aim at the acceleration of two flagships algorithms within the computer vision community: the Viola-Jones face detection algorithm and the Scale Invariant Feature Transform (SIFT). Experimental results prove the feasibility and benefits of this architectural solution.Ministerio de Economía y Competitividad TEC2012-38921-C02, IPT-2011- 1625-430000, IPC-20111009Junta de Andalucía TIC 2338-2013Xunta de Galicia EM2013/038Office of NavalResearch (USA) N00014141035

    Pixels for focal-plane scale space generation and for high dynamic range imaging

    Get PDF
    Focal-plane processing allows for parallel processing throughout the entire pixel matrix, which can help increasing the speed of vision systems. The fabrication of circuits inside the pixel matrix increases the pixel pitch and reduces the fill factor, which leads to reduced image quality. To take advantage of the focal-plane processing capabilities and minimize image quality reduction, we first consider the inclusion of only two extra transistors in the pixel, allowing for scale space generation at the focal plane. We assess the conditions in which the proposed circuitry is advantageous. We perform a time and energy analysis of this approach in comparison to a digital solution. Considering that a SAR ADC per column is used and the clock frequency is equal to 5.6 MHz, the proposed analysis show that the focal-plane approach is 26 times faster if the digital solution uses 10 processing elements, and 49 times more energy-efficient. Another way of taking advantage of the focal-plane signal processing capability is by using focal-plane processing for increasing image quality itself, such as in the case of high dynamic range imaging pixels. This work also presents the design and study of a pixel that captures high dynamic range images by sensing the matrix average luminance, and then adjusting the integration time of each pixel according to the global average and to the local value of the pixel. This pixel was implemented considering small structural variations, such as different photodiode sizes for global average luminance measurement. Schematic and post-layout simulations were performed with the implemented pixel using an input image of 76 dB, presenting results with details in both dark and bright image areas.O processamento no plano focal de imageadores permite que a imagem capturada seja processada em paralelo por toda a matrix de pixels, característica que pode aumentar a velocidade de sistemas de visão. Ao fabricar circuitos dentro da matrix de pixels, o tamanho do pixel aumenta e a razão entre área fotossensível e a área total do pixel diminui, reduzindo a qualidade da imagem. Para utilizar as vantagens do processamento no plano focal e minimizar a redução da qualidade da imagem, a primeira parte da tese propõe a inclusão de dois transistores no pixel, o que permite que o espaço de escalas da imagem capturada seja gerado. Nós então avaliamos em quais condições o circuito proposto é vantajoso. Nós analisamos o tempo de processamento e o consumo de energia dessa proposta em comparação com uma solução digital. Utilizando um conversor de aproximações sucessivas com frequência de 5.6 MHz, a análise proposta mostra que a abordagem no plano focal é 26 vezes mais rápida que o circuito digital com 10 elementos de processamento, e consome 49 vezes menos energia. Outra maneira de utilizar processamento no plano focal consiste em aplicá-lo para melhorar a qualidade da imagem, como na captura de imagens em alta faixa dinâmica. Esta tese também apresenta o estudo e projeto de um pixel que realiza a captura de imagens em alta faixa dinâmica através do ajuste do tempo de integração de cada pixel utilizando a iluminação média e o valor do próprio pixel. Esse pixel foi projetado considerando pequenas variações estruturais, como diferentes tamanhos do fotodiodo que realiza a captura do valor de iluminação médio. Simulações de esquemático e pós-layout foram realizadas com o pixel projetado utilizando uma imagem com faixa dinâmica de 76 dB, apresentando resultados com detalhes tanto na parte clara como na parte escura da imagem

    A Wearable Indoor Navigation System for Blind and Visually Impaired Individuals

    Get PDF
    Indoor positioning and navigation for blind and visually impaired individuals has become an active field of research. The development of a reliable positioning and navigational system will reduce the suffering of the people with visual disabilities, help them live more independently, and promote their employment opportunities. In this work, a coarse-to-fine multi-resolution model is proposed for indoor navigation in hallway environments based on the use of a wearable computer called the eButton. This self-constructed device contains multiple sensors which are used for indoor positioning and localization in three layers of resolution: a global positioning system (GPS) layer for building identification; a Wi-Fi - barometer layer for rough position localization; and a digital camera - motion sensor layer for precise localization. In this multi-resolution model, a new theoretical framework is developed which uses the change of atmospheric pressure to determine the floor number in a multistory building. The digital camera and motion sensors within the eButton acquire both pictorial and motion data as a person with a normal vision walks along a hallway to establish a database. Precise indoor positioning and localization information is provided to the visually impaired individual based on a Kalman filter fusion algorithm and an automatic matching algorithm between the acquired images and those in the pre-established database. Motion calculation is based on the data from motion sensors is used to refine the localization result. Experiments were conducted to evaluate the performance of the algorithms. Our results show that the new device and algorithms can precisely determine the floor level and indoor location along hallways in multistory buildings, providing a powerful and unobtrusive navigational tool for blind and visually impaired individuals

    Dynamically reconfigurable architecture for embedded computer vision systems

    Get PDF
    The objective of this research work is to design, develop and implement a new architecture which integrates on the same chip all the processing levels of a complete Computer Vision system, so that the execution is efficient without compromising the power consumption while keeping a reduced cost. For this purpose, an analysis and classification of different mathematical operations and algorithms commonly used in Computer Vision are carried out, as well as a in-depth review of the image processing capabilities of current-generation hardware devices. This permits to determine the requirements and the key aspects for an efficient architecture. A representative set of algorithms is employed as benchmark to evaluate the proposed architecture, which is implemented on an FPGA-based system-on-chip. Finally, the prototype is compared to other related approaches in order to determine its advantages and weaknesses

    Towards a Common Software/Hardware Methodology for Future Advanced Driver Assistance Systems

    Get PDF
    The European research project DESERVE (DEvelopment platform for Safe and Efficient dRiVE, 2012-2015) had the aim of designing and developing a platform tool to cope with the continuously increasing complexity and the simultaneous need to reduce cost for future embedded Advanced Driver Assistance Systems (ADAS). For this purpose, the DESERVE platform profits from cross-domain software reuse, standardization of automotive software component interfaces, and easy but safety-compliant integration of heterogeneous modules. This enables the development of a new generation of ADAS applications, which challengingly combine different functions, sensors, actuators, hardware platforms, and Human Machine Interfaces (HMI). This book presents the different results of the DESERVE project concerning the ADAS development platform, test case functions, and validation and evaluation of different approaches. The reader is invited to substantiate the content of this book with the deliverables published during the DESERVE project. Technical topics discussed in this book include:Modern ADAS development platforms;Design space exploration;Driving modelling;Video-based and Radar-based ADAS functions;HMI for ADAS;Vehicle-hardware-in-the-loop validation system

    A high-performance hardware architecture of an image matching system based on the optimised SIFT algorithm

    Get PDF
    The Scale Invariant Feature Transform (SIFT) is one of the most popular matching algorithms in the field of computer vision. It takes over many other algorithms because features detected are fully invariant to image scaling and rotation, and are also shown to be robust to changes in 3D viewpoint, addition of noise, changes in illumination and a sustainable range of affine distortion. However, the computational complexity is high, which prevents it from achieving real-time. The aim of this project, therefore, is to develop a high-performance image matching system based on the optimised SIFT algorithm to perform real-time feature detection, description and matching. This thesis presents the stages of the development of the system. To reduce the computational complexity, an alternative to the grid layout of standard SIFT is proposed, which is termed as SRI-DASIY (Scale and Rotation Invariant DAISY). The SRI-DAISY achieves comparable performance with the standard SIFT descriptor, but is more efficient to be implemented using hardware, in terms of both computational complexity and memory usage. The design takes only 7.57 µs to generate a descriptor with a system frequency of 100 MHz, which is equivalent to approximately 132,100 descriptors per second and is of the highest throughput when compared with existing designs. Besides, a novel keypoint matching strategy is also presented in this thesis, which achieves higher precision than the widely applied distance ratio based matching and is computationally more efficient. All phases of the SIFT algorithm have been investigated, including feature detection, descriptor generation and descriptor matching. The characterisation of each individual part of the design is carried out and compared with the software simulation results. A fully stand-alone image matching system has been developed that consists of a CMOS camera front-end for image capture, a SIFT processing core embedded in a Field Programmable Logic Array (FPGA) device, and a USB back-end for data transfer. Experiments are conducted by using real-world images to verify the system performance. The system has been tested by integrating into two practical applications. The resulting image matching system eliminates the bottlenecks that limit the overall throughput of the system, and hence allowing the system to process images in real-time without interruption. The design can be modified to adapt to the applications processing images with higher resolution and is still able to achieve real-time

    Constructing the space of visual attention

    Get PDF
    Thesis (S.M.)--Massachusetts Institute of Technology, Dept. of Architecture, 2012.This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections.Page 180 blank. Cataloged from student-submitted PDF version of thesis.Includes bibliographical references (p. 168-171).This thesis explores the nature of a human experience in space through a primary inquiry into vision. This inquiry begins by questioning the existing methods and instruments employed to capture and represent a human experience of space. While existing qualitative and quantitative methods and instruments -- from "subjective" interviews to "objective" photographic documentation -- may lead to insight in the study of a human experience in space, we argue that they are inherently limited with respect to physiological realities. As one moves about the world, one believes to see the world as continuous and fully resolved. However, this is not how human vision is currently understood to function on a physiological level. If we want to understand how humans visually construct a space, then we must examine patterns of visual attention on a physiological level. In order to inquire into patterns of visual attention in three dimensional space, we need to develop new instruments and new methods of representation. The instruments we require, directly address the physiological realities of vision, and the methods of representation seek to situate the human subject within a space of their own construction. In order to achieve this goal we have developed PUPIL, a custom set of hardware and software instruments, that capture the subject's eye movements. Using PUPIL, we have conducted a series of trials from proof of concept -- demonstrating the capabilities of our instruments -- to critical inquiry of the relationship between a human subject and a space. We have developed software to visualize this unique spatial experience, and have posed open questions based on the initial findings of our trials. This thesis aims to contribute to spatial design disciplines, by providing a new way to capture and represent a human experience of space.by Moritz Philipp Kassner [and] William Rhoades Patera.S.M
    corecore