447 research outputs found

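    Proposal of architecture and circuits for improving the dynamic range of vision systems on a chip designed in deep-submicron CMOS technologies (original title: Propuesta de arquitectura y circuitos para la mejora del rango dinámico de sistemas de visión en un chip diseñados en tecnologías CMOS profundamente submicrométrica)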

    This thesis proposes new techniques for extending the dynamic range of electronic image sensors. We focus on providing this functionality on a single chip, that is, without any external hardware or software support, forming what is known as a Vision System on a Chip (VSoC). The dynamic range of an electronic image sensor is defined as the ratio between the maximum and the minimum measurable illumination. There are two ways to improve this figure. The first is to lower the minimum measurable light by reducing the noise in the image sensor. The second is to raise the maximum measurable light by extending the sensor's saturation limit. Chronologically, our first approach to improving dynamic range was noise reduction. Several options exist for improving the system's noise figure of merit: reducing noise by using a CIS (CMOS image sensor) technology, or using dedicated circuits such as calibration or auto-zeroing. However, circuit techniques carry limitations that can only be overcome with non-standard technologies designed specifically for this purpose. The CIS technology used is aimed at improving the quality and capabilities of the photosensing process, such as sensitivity, noise, and colour imaging. To study the characteristics of the technology in more detail, a test chip was designed, which allowed the best options for future pixels to be identified. Nevertheless, despite satisfactory overall behaviour, the dynamic range measurements showed that the improvement achievable with CIS technology alone is very limited. That is, improving the sensor's dark current is not sufficient for our purpose. Further improvement of the dynamic range requires circuits inside the pixel.
However, CIS technologies usually allow nothing more than NMOS transistors next to the photosensor, which seriously restricts the circuits that can be used. As a result, the design of an image sensor with improved dynamic range in CIS technology was abandoned in favour of a standard technology, which gives more flexibility in pixel design. In standard technologies, it is possible to introduce a high level of functionality using in-pixel circuits, which enables advanced techniques for extending the saturation limit of image sensors. Two options arise for this purpose: linear or compressive acquisition. With linear acquisition, a large amount of data is generated for each pixel. For example, if the dynamic range of the scene is 120 dB, at least 20 bits/pixel are needed for its binary representation, since $\log_2\left(10^{120/20}\right) \approx 19.93$. This would require extensive resources to process the data and a large bandwidth to move it to the processing circuitry. To avoid these problems, high-dynamic-range image sensors usually opt for compressive acquisition of light. This involves two tasks: capturing the image and compressing it. Image capture is performed at pixel level, in the photosensing device, whereas image compression can be performed at pixel level, at system level, or by external post-processing. In the post-processing case, there is a field of research that studies the compression of high-dynamic-range scenes while preserving detail, producing a result suitable for human perception on conventional low-dynamic-range displays. This is called tone mapping, and it usually employs only 8 bits/pixel to represent images, as this is the standard for low-dynamic-range images.
Compressive-acquisition pixels, for their part, perform a compression that does not depend on the high-dynamic-range scene being captured, which results in either low compression or a loss of detail and contrast. To avoid these drawbacks, this work presents a compressive-acquisition pixel that applies a tone-mapping technique, allowing images to be captured already compressed in a way that is optimised to preserve detail and contrast while producing a very small amount of data. Tone-mapping techniques are normally run as software post-processing on a computer, operating on images captured without compression, which contain a large amount of data. These techniques have traditionally belonged to the field of computer graphics because of the large computational effort they require. However, we have developed a new tone-mapping algorithm specially adapted to exploit in-pixel circuits and requiring little computation outside the pixel array, which enables the development of a vision system on a single chip. The new tone-mapping algorithm, a mathematical concept that can be simulated in software, has also been implemented on a chip. This hardware implementation, however, requires some adaptations and advanced design techniques, which are themselves another contribution of this work. Furthermore, because of the new functionality, the usual methods for characterisation and image capture have been modified.
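
The bit-depth arithmetic in the abstract above (a 120 dB scene needing about 20 bits per pixel under linear acquisition) can be checked with a short sketch. The function names are hypothetical, and the 20·log10 decibel convention is assumed because it matches the figure of 19.93 quoted in the abstract.

```python
import math


def exact_bits(dynamic_range_db: float) -> float:
    """Bits needed to span a dynamic range given in dB (20*log10 convention).

    log2(10 ** (dB / 20)) is rewritten as (dB / 20) * log2(10) so that very
    large dB values do not overflow.
    """
    return dynamic_range_db / 20.0 * math.log2(10.0)


def bits_per_pixel(dynamic_range_db: float) -> int:
    """Smallest whole number of bits that covers the dynamic range linearly."""
    return math.ceil(exact_bits(dynamic_range_db))


if __name__ == "__main__":
    for db in (48, 60, 120):
        print(f"{db} dB -> {exact_bits(db):.2f} bits, stored as {bits_per_pixel(db)}")
```

Under the same convention, the 8 bits/pixel typical of low-dynamic-range images correspond to roughly 48 dB, which gives a sense of how much compression tone mapping has to achieve.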

    RRR-robot : instruction manual


    Computational strategies for understanding underwater optical image datasets

    Thesis: Ph. D. in Mechanical and Oceanographic Engineering, Joint Program in Oceanography/Applied Ocean Science and Engineering (Massachusetts Institute of Technology, Department of Mechanical Engineering; and the Woods Hole Oceanographic Institution), 2013. Cataloged from PDF version of thesis. Includes bibliographical references (pages 117-135). A fundamental problem in autonomous underwater robotics is the high latency between the capture of image data and the time at which operators are able to gain a visual understanding of the survey environment. Typical missions can generate imagery at rates hundreds of times greater than highly compressed images can be transmitted acoustically, delaying that understanding until after the vehicle has been recovered and the data analyzed. While automated classification algorithms can lessen the burden on human annotators after a mission, most are too computationally expensive or lack the robustness to run in situ on a vehicle. Fast algorithms designed for mission-time performance could lessen the latency of understanding by producing low-bandwidth semantic maps of the survey area that can then be telemetered back to operators during a mission. This thesis presents a lightweight framework for processing imagery in real time aboard a robotic vehicle. We begin with a review of pre-processing techniques for correcting illumination and attenuation artifacts in underwater images, presenting our own approach based on multi-sensor fusion and a strong physical model. Next, we construct a novel image pyramid structure that can reduce the complexity necessary to compute features across multiple scales by an order of magnitude and recommend features which are fast to compute and invariant to underwater artifacts.
Finally, we implement our framework on real underwater datasets and demonstrate how it can be used to select summary images for the purpose of creating low-bandwidth semantic maps capable of being transmitted acoustically. By Jeffrey W. Kaeli.

    A Multi-Band Far-Infrared Survey with a Balloon-Borne Telescope

    Nine additional radiation sources, above a 3-sigma confidence level of 1300 Jy, were identified at 100 microns by far infrared photometry of the galactic plane using a 0.4 meter aperture, liquid helium cooled, multichannel far infrared balloon-borne telescope. The instrument is described, including its electronics, pointing and suspension systems, and ground support equipment. Testing procedures and flight staging are discussed along with the reduction and analysis of the data acquired. The history of infrared astronomy is reviewed. General infrared techniques and the concerns of balloon astronomers are explored

    Three-Dimensional Geometry Inference of Convex and Non-Convex Rooms using Spatial Room Impulse Responses

    This thesis presents research focused on the problem of geometry inference for both convex- and non-convex-shaped rooms, through the analysis of spatial room impulse responses. Current geometry inference methods are only applicable to convex-shaped rooms, require between 6 and 78 discretely spaced measurement positions, and are only accurate under certain conditions, such as a first-order reflection for each boundary being identifiable across all, or some subset of, these measurements. This thesis proposes that, by using compact microphone arrays capable of capturing spatiotemporal information, boundary locations, and hence room shape for both convex and non-convex cases, can be inferred using only enough measurement positions to ensure that each boundary has a first-order reflection attributable to, and identifiable in, at least one measurement. To support this, three research areas are explored. Firstly, the accuracy of direction-of-arrival estimation for reflections in binaural room impulse responses is explored, using a state-of-the-art methodology based on binaural-model-fronted neural networks. This establishes whether a two-microphone array can produce direction-of-arrival estimates accurate enough for geometry inference. Secondly, a spatiotemporal decomposition workflow based on a spherical microphone array is explored for analysing reflections in room impulse responses. This establishes that simultaneously arriving reflections can be individually detected, relaxing constraints on measurement positions. Finally, a geometry inference method applicable to both convex and more complex non-convex-shaped rooms is proposed. This research therefore expands the scenarios in which geometry inference can be successfully applied, at a level of accuracy comparable to existing work, through the use of commonly used compact microphone arrays. Based on these results, future improvements to this approach are presented and discussed in detail.
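
To make concrete how a single first-order reflection can locate a boundary, the sketch below uses the textbook image-source relation. It is not the method proposed in the thesis: the function name, the assumed speed of sound, and the inputs (a known source position, a unit direction of arrival pointing from the microphone toward the reflection, and a time of arrival measured from emission) are all illustrative assumptions.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed value for air at room temperature


def wall_from_reflection(source, mic, doa, toa, c=SPEED_OF_SOUND):
    """Estimate a wall plane from one first-order reflection.

    source, mic: 3-D positions in metres (tuples).
    doa: unit vector from the microphone toward the arriving reflection.
    toa: propagation time of the reflection in seconds, measured from emission.
    Returns (normal, offset) describing the plane normal . x = offset.
    """
    # The reflection appears to come from an image source located c*toa
    # away from the microphone along the direction of arrival.
    image = tuple(m + c * toa * u for m, u in zip(mic, doa))

    # The wall is the perpendicular bisector plane between the real source
    # and its mirror image.
    diff = tuple(i - s for i, s in zip(image, source))
    length = math.sqrt(sum(d * d for d in diff))
    if length == 0.0:
        raise ValueError("image source coincides with the source")
    normal = tuple(d / length for d in diff)
    midpoint = tuple((i + s) / 2.0 for i, s in zip(image, source))
    offset = sum(n * p for n, p in zip(normal, midpoint))
    return normal, offset


if __name__ == "__main__":
    # Synthetic check: a wall at x = 3 m mirrors a source at (1, 0, 0) to (5, 0, 0).
    src, mic = (1.0, 0.0, 0.0), (2.0, 1.0, 0.0)
    path = math.sqrt(10.0)  # distance from the image source to the microphone
    doa = (3.0 / path, -1.0 / path, 0.0)
    print(wall_from_reflection(src, mic, doa, path / SPEED_OF_SOUND))
```

In this idealised setting one measurement position suffices for a boundary, provided its first-order reflection can be identified and attributed to it, which is the condition the abstract states.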

    Features for matching people in different views

    There have been significant advances in the computer vision field during the last decade. During this period, many methods have been developed that have been successful in solving challenging problems including Face Detection, Object Recognition and 3D Scene Reconstruction. The solutions developed by computer vision researchers have been widely adopted and used in many real-life applications, such as those faced in the medical and security industries. Among the different branches of computer vision, Object Recognition has been an area that has advanced rapidly in recent years. The successful introduction of approaches such as feature extraction and description has been an important factor in the growth of this area. In recent years, researchers have attempted to apply these approaches to other problems such as Content Based Image Retrieval and Tracking. In this work, we present a novel system that finds correspondences between people seen in different images. Unlike other approaches that rely on a video stream to track the movement of people between images, here we present a feature-based approach where we locate a target's new location in an image based only on its visual appearance. Our proposed system comprises three steps. In the first step, a set of features is extracted from the target's appearance. A novel algorithm is developed that allows extraction of features from a target that are particularly suitable to the modelling task. In the second step, each feature is characterised using a combined colour and texture descriptor. Including information about both the colour and the texture of a feature adds to the descriptor's distinctiveness. Finally, the target's appearance and pose are modelled as a collection of such features and descriptors. This collection is then used as a template that allows us to search for a similar combination of features in other images that correspond to the target's new location.
We have demonstrated the effectiveness of our system in locating a target's new position in an image, despite differences in viewpoint, scale or elapsed time between the images. The characterisation of a target as a collection of features also allows our system to deal robustly with partial occlusion of the target.

    Unconstrained Road Sign Recognition

    There are many types of road signs, each of which carries a different meaning and function: some signs regulate traffic, others indicate the state of the road or guide and warn drivers and pedestrians. Existing image-based road sign recognition systems work well under ideal conditions, but experience problems when the lighting conditions are poor or the signs are partially occluded. The aim of this research is to propose techniques to recognize road signs in a real outdoor environment, especially to deal with poor lighting and partially occluded road signs. To achieve this, hybrid segmentation and classification algorithms are proposed. In the first part of the thesis, we propose a hybrid dynamic threshold colour segmentation algorithm based on histogram analysis. A dynamic threshold is very important in road sign segmentation, since road sign colours may change throughout the day due to environmental conditions. In the second part, we propose a geometrical shape symmetry detection and reconstruction algorithm to detect and reconstruct the shape of the sign when it is partially occluded. This algorithm is robust to scale changes and rotations. The last part of this thesis deals with feature extraction and classification. We propose a hybrid feature vector based on histograms of oriented gradients, local binary patterns, and the scale-invariant feature transform. This vector is fed into a classifier that combines a Support Vector Machine (SVM) using a Random Forest and a hybrid SVM k-Nearest Neighbours (kNN) classifier. The overall method proposed in this thesis achieves an accuracy of 99.4% in ideal conditions, 98.6% in noisy and fading conditions, 98.4% in poor lighting conditions, and 92.5% for partially occluded road signs on the GRAMUAH traffic signs dataset.
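
The abstract does not describe how its hybrid dynamic threshold is derived from the histogram. As a rough illustration of the general idea, the sketch below recomputes a threshold from each image's own histogram using Otsu's method, a standard technique swapped in here purely for illustration. The function name and the choice of a single 8-bit channel (for example hue or intensity) are assumptions.

```python
def otsu_threshold(pixels, levels=256):
    """Return the threshold t that best separates values <= t from values > t.

    pixels: iterable of integers in [0, levels).
    The threshold maximises the between-class variance of the histogram,
    so it adapts to the brightness distribution of each image.
    """
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = sum(hist)
    sum_all = sum(i * h for i, h in enumerate(hist))

    best_t, best_score = 0, -1.0
    weight_low, sum_low = 0, 0
    for t in range(levels):
        weight_low += hist[t]
        sum_low += t * hist[t]
        weight_high = total - weight_low
        if weight_low == 0:
            continue  # no pixels in the lower class yet
        if weight_high == 0:
            break  # every pixel is already in the lower class
        mean_low = sum_low / weight_low
        mean_high = (sum_all - sum_low) / weight_high
        # Between-class variance, up to a constant factor of 1 / total**2.
        score = weight_low * weight_high * (mean_low - mean_high) ** 2
        if score > best_score:
            best_t, best_score = t, score
    return best_t


if __name__ == "__main__":
    dim_scene = [10] * 50 + [200] * 50
    bright_scene = [60] * 50 + [250] * 50
    print(otsu_threshold(dim_scene), otsu_threshold(bright_scene))
```

Because the threshold is recomputed per image, the same sign colour can be separated under different lighting, which is the motivation the abstract gives for a dynamic rather than fixed threshold.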

    An artifacts removal post-processing for epiphyseal region-of-interest (EROI) localization in automated bone age assessment (BAA)

    Background: Segmentation is the most crucial step in computer-aided bone age assessment. A well-known type of segmentation used in such systems is adaptive segmentation. While it gives better results than global thresholding, adaptive segmentation produces a lot of unwanted noise that can affect the later process of epiphysis extraction. Methods: The proposed method uses anisotropic diffusion as pre-processing and a novel Bounded Area Elimination (BAE) post-processing algorithm to improve the ossification site localization technique, with the aim of improving the adaptive segmentation result and the region-of-interest (ROI) localization accuracy. Results: The results are evaluated by quantitative analysis and by qualitative analysis using texture feature evaluation. Image homogeneity after anisotropic diffusion improved by an average of 17.59% across the age groups. Experiments showed that smoothness improved by an average of 35% after the BAE algorithm and that ROI localization improved by an average of 8.19%. The MSSIM improved by an average of 10.49% after applying the BAE algorithm to the adaptively segmented hand radiographs. Conclusions: The results indicate that anisotropic diffusion greatly reduces the noise in the segmented hand radiographs and that the proposed BAE algorithm is capable of removing the artifacts generated by adaptive segmentation.
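
The abstract does not say which anisotropic diffusion variant or parameters were used. The sketch below assumes the classic Perona-Malik scheme with an exponential conductance function on a 4-neighbour grid; the function name and the values of `kappa`, `step` and `iterations` are illustrative assumptions. It shows why the filter suits pre-processing: small intensity differences (noise) are smoothed, while large ones (edges) are largely preserved.

```python
import math


def anisotropic_diffusion(image, iterations=10, kappa=20.0, step=0.2):
    """Perona-Malik diffusion on a 2-D list of intensities.

    kappa sets the gradient magnitude treated as an edge; step must stay at
    or below 0.25 for the 4-neighbour scheme to remain stable.
    """
    height, width = len(image), len(image[0])
    current = [[float(v) for v in row] for row in image]

    def conductance(diff):
        # Near 1 for small differences (smooth), near 0 across strong edges.
        return math.exp(-((diff / kappa) ** 2))

    for _ in range(iterations):
        updated = [row[:] for row in current]
        for y in range(height):
            for x in range(width):
                centre = current[y][x]
                flux = 0.0
                for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                    ny, nx = y + dy, x + dx
                    if 0 <= ny < height and 0 <= nx < width:
                        diff = current[ny][nx] - centre
                        flux += conductance(diff) * diff
                updated[y][x] = centre + step * flux
        current = updated
    return current


if __name__ == "__main__":
    # Dark region (0) next to a bright region (100), with one noisy pixel.
    img = [[0, 0, 0, 100, 100, 100] for _ in range(5)]
    img[2][1] = 8
    out = anisotropic_diffusion(img)
    print(round(out[2][1], 2), round(out[2][4], 2))
```

In the example, the isolated noisy pixel is pulled toward its neighbours while the step between the dark and bright regions stays sharp.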

    Development of new intelligent autonomous robotic assistant for hospitals

    Continuous technological development in modern societies has increased the quality of life and average life-span of people. This imposes an extra burden on the current healthcare infrastructure, which also creates the opportunity for developing new, autonomous, assistive robots to help alleviate this extra workload. The research question explored the extent to which a prototypical robotic platform can be created and how it may be implemented in a hospital environment with the aim of assisting hospital staff with daily tasks, such as guiding patients and visitors, following patients to ensure safety, and making deliveries to and from rooms and workstations. In terms of major contributions, this thesis outlines five domains of the development of an actual robotic assistant prototype. Firstly, a comprehensive schematic design is presented in which mechanical, electrical, motor control and kinematics solutions have been examined in detail. Next, a new method has been proposed for assessing the intrinsic properties of different flooring types using machine learning to classify mechanical vibrations. Thirdly, the technical challenge of enabling the robot to simultaneously map and localise itself in a dynamic environment has been addressed, whereby leg detection is introduced to ensure that, whilst mapping, the robot is able to distinguish between people and the background. The fourth contribution is the integration of geometric collision prediction into stabilised dynamic navigation methods, thus improving the ability to update path planning in real time in a dynamic environment. Lastly, the problem of detecting gaze at long distances has been addressed by means of a new eye-tracking hardware solution which combines infra-red eye tracking and depth sensing.
The research serves both to provide a template for the development of comprehensive mobile assistive-robot solutions, and to address some of the inherent challenges currently present in introducing autonomous assistive robots in hospital environments.