    Proposal of architecture and circuits for dynamic range enhancement of vision systems on chip designed in deep-submicron CMOS technologies

    The work presented in this thesis proposes new techniques for extending the dynamic range of electronic image sensors. Our studies target the possibility of providing this functionality in a single chip, that is, without any external hardware or software support, forming a type of system known as a Vision System on Chip (VSoC). The dynamic range of an electronic image sensor is defined as the ratio between the maximum and the minimum measurable illumination. Two options arise for improving this figure. The first is to lower the minimum measurable light by reducing the noise in the image sensor. The second is to raise the maximum measurable light by extending the saturation limit of the sensor. Chronologically, our first approach to improving the dynamic range was based on reducing noise. Several options are available for improving the noise figure of merit of the system: reducing noise by using a CIS technology, or using dedicated circuits such as calibration or auto-zeroing. However, circuit techniques entail limitations that can only be overcome by using non-standard technologies specially designed for this purpose. The CIS technology used is aimed at improving the quality and capabilities of the photosensing process, such as sensitivity, noise, and support for color imaging. To study the characteristics of the technology in more detail, a test chip was designed, which makes it possible to identify the best options for future pixels. Nevertheless, despite satisfactory overall behavior, the dynamic range measurements indicated that the improvement achievable with CIS technology alone is very limited. That is, the improvement in the dark current of the sensor is not sufficient for our purpose. A greater improvement in dynamic range requires circuits inside the pixel.
However, CIS technologies usually allow nothing more than NMOS transistors next to the photosensor, which severely restricts the circuits that can be used. As a result, the design of an image sensor with enhanced dynamic range in CIS technologies was discarded in favor of a standard technology, which gives more flexibility to the pixel design. In standard technologies, a high level of functionality can be introduced using in-pixel circuits, which enables advanced techniques for extending the saturation limit of image sensors. Two options arise for this purpose: linear or compressive acquisition. With linear acquisition, a large amount of data is generated for each pixel. As an example, if the dynamic range of the scene is 120 dB, at least 20 bits/pixel would be needed for the binary representation of this dynamic range, since log2(10^(120/20)) = 19.93. This would require extensive resources to process the large amount of data, and a large bandwidth to move it to the processing circuitry. To avoid these problems, high dynamic range image sensors usually opt for compressive acquisition of light. This implies two tasks: capturing the image and compressing it. Image capture takes place at the pixel level, in the photosensing device, whereas image compression can be performed at the pixel level, at the system level, or through external post-processing. In the post-processing case, there is a field of research that studies the compression of high dynamic range scenes while preserving details, producing a result suitable for human perception on conventional low dynamic range monitors. This is called tone mapping, and it usually employs only 8 bits/pixel for image representation, since this is the standard for low dynamic range images.
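The bit-depth arithmetic in the 120 dB example can be checked with a short sketch. It assumes the usual 20·log10 definition of dynamic range for an intensity ratio and a plain linear binary code; the function names are hypothetical and do not come from the thesis.

```python
import math

def bits_for_dynamic_range(dr_db: float) -> int:
    """Smallest linear code width (in bits) that spans dr_db decibels."""
    ratio = 10 ** (dr_db / 20)          # dB -> linear max/min illumination ratio
    return math.ceil(math.log2(ratio))  # bits needed to represent that ratio

def dynamic_range_of_bits(bits: int) -> float:
    """Dynamic range in dB covered by a linear code of the given width."""
    return 20 * math.log10(2 ** bits)

print(bits_for_dynamic_range(120))         # 20, since log2(10^6) is about 19.93
print(round(dynamic_range_of_bits(8), 2))  # 48.16
```

Under the same assumption, an 8 bits/pixel linear image spans only about 48 dB, which illustrates why a 120 dB scene has to be compressed before it fits the low dynamic range format.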
Compressive-acquisition pixels, for their part, perform a compression that does not depend on the high dynamic range scene being captured, which results in low compression or in loss of detail and contrast. To avoid these drawbacks, this work presents a compressive-acquisition pixel that applies a tone mapping technique, allowing images to be captured already compressed in a way optimized to preserve detail and contrast while producing a very small amount of data. Tone mapping techniques normally run as software post-processing on a computer, operating on images captured without compression, which contain a large amount of data. These techniques have traditionally belonged to the field of computer graphics because of the large computational effort they require. However, we have developed a new tone mapping algorithm specially adapted to take advantage of in-pixel circuits and requiring little computation outside the pixel array, which enables the development of a vision system on a single chip. The new tone mapping algorithm, which is a mathematical concept that can be simulated in software, has also been implemented on a chip. This hardware implementation requires some adaptations and advanced design techniques, which are themselves another contribution of this work.

    A Review on Digital Pixel Sensors

    The digital pixel sensor (DPS) has evolved into a pivotal component of modern imaging systems and has the potential to revolutionize fields such as medical imaging, astronomy, surveillance, and IoT devices. Compared to analog pixel sensors, the DPS offers high speed and good image quality. However, the complexity introduced within each pixel, primarily to accommodate the ADC circuit, substantially increases the pixel pitch. This increase undermines the feasibility of high-density integration, an obstacle that significantly narrows the field of potential applications. Nonetheless, designing compact conversion circuits, along with strategic use of 3D integration architectures, is a potential remedy. This review article presents a comprehensive overview of DPS technology. The operating principles, advantages, and challenges of different types of DPS circuits are analyzed. We classify the schemes into several categories based on ADC operation. A comparative study based on different performance metrics is also presented for a well-rounded understanding.

    Low Latency Displays for Augmented Reality

    The primary goal for Augmented Reality (AR) is bringing the real and virtual together into a common space. Maintaining this illusion, however, requires preserving spatially and temporally consistent registration despite changes in user or object pose. The greatest source of registration error is latency (the delay between when something moves and when the display changes in response), which breaks temporal consistency. Furthermore, the real world varies greatly in brightness, ranging from bright sunlight to deep shadows. Thus, a compelling AR system must also support High-Dynamic Range (HDR) to keep its virtual objects’ appearance both spatially and temporally consistent with the real world. This dissertation presents new methods, implementations, results (both visual and performance), and future steps for low latency displays, primarily in the context of Optical See-through Augmented Reality (OST-AR) Head-Mounted Displays, focusing on temporal consistency in registration, HDR color support, and spatial and temporal consistency in brightness: 1. For registration temporal consistency, the primary insight is breaking the conventional display paradigm, in which computers render imagery, frame by frame, and then transmit it to the display for emission. Instead, the display must also contribute towards rendering by performing a post-rendering, post-transmission warp of the computer-supplied imagery in the display hardware. By compensating in the display for system latency using the latest tracking information, much of the latency can be short-circuited. Furthermore, the low latency display must support ultra-high frequency (multiple kHz) refreshing to minimize pose displacement between updates. 2. For HDR color support, the primary insight is developing new display modulation techniques. DMDs, a type of ultra-high frequency display, emit binary output, which requires modulation to produce multiple brightness levels. 
Conventional modulation breaks low latency guarantees, and modulating bright LED illuminators at the frequencies needed to support kHz-order HDR exceeds their capabilities. Thus one must directly synthesize the necessary variation in brightness. 3. For spatial and temporal brightness consistency, the primary insight is integrating HDR light sensors into the display hardware: the same processes which both compensate for latency and generate HDR output can also modify it in response to the spatially sensed brightness of the real world.

    Panoramic, large-screen, 3-D flight display system design

    The report documents and summarizes the results of the required evaluations specified in the SOW and the design specifications for the selected display system hardware. Also included are the proposed development plan and schedule as well as the estimated rough order of magnitude (ROM) cost to design, fabricate, and demonstrate a flyable prototype research flight display system. The thrust of the effort was development of a complete understanding of the user/system requirements for a panoramic, collimated, 3-D flyable avionic display system and the translation of the requirements into an acceptable system design for fabrication and demonstration of a prototype display in the early 1997 time frame. Eleven display system design concepts were presented to NASA LaRC during the program, one of which was down-selected to a preferred display system concept. A set of preliminary display requirements was formulated. The state of the art in image source technology, 3-D methods, collimation methods, and interaction methods for a panoramic, 3-D flight display system were reviewed in depth and evaluated. Display technology improvements and risk reductions associated with maturity of the technologies for the preferred display system design concept were identified

    Design of a High-Speed Architecture for Stabilization of Video Captured Under Non-Uniform Lighting Conditions

    Video captured under shaky conditions may exhibit unwanted vibrations. A robust algorithm that stabilizes the video by compensating for vibrations arising from the physical setting of the camera is presented in this dissertation. A very high performance hardware architecture on Field Programmable Gate Array (FPGA) technology is also developed for the implementation of the stabilization system. Stabilization of video sequences captured under non-uniform lighting conditions begins with a nonlinear enhancement process. This improves the visibility of the scene captured from physical sensing devices which have limited dynamic range. This physical limitation causes the saturated region of the image to shadow out the rest of the scene. It is therefore desirable to bring back a more uniform scene which eliminates the shadows to a certain extent. Stabilization of video requires the estimation of global motion parameters. By obtaining reliable background motion, the video can be spatially transformed to the reference sequence, thereby eliminating the unintended motion of the camera. A reflectance-illuminance model for video enhancement is used in this research work to improve the visibility and quality of the scene. With fast color space conversion, the computational complexity is reduced to a minimum. The basic video stabilization model is formulated and configured for hardware implementation. Such a model involves evaluation of reliable features for tracking, motion estimation, and affine transformation to map the display coordinates of a stabilized sequence. The multiplications, divisions and exponentiations are replaced by simple arithmetic and logic operations using improved log-domain computations in the hardware modules. On Xilinx's Virtex II 2V8000-5 FPGA platform, the prototype system consumes 59% of the logic slices, 30% of the flip-flops, 34% of the lookup tables, 35% of the embedded RAMs and two ZBT frame buffers. 
The system is capable of rendering 180.9 million pixels per second (mpps) and consumes approximately 30.6 watts of power at 1.5 volts. With a 1024×1024 frame, the throughput is equivalent to 172 frames per second (fps). Future work will optimize the performance-resource trade-off to meet the specific needs of the applications. It will further extend the model to the extraction and tracking of moving objects, as our model inherently encapsulates the attributes of spatial distortion and motion prediction to reduce complexity. With these parameters to narrow down the processing range, it is possible to achieve a minimum of 20 fps on desktop computers with Intel Core 2 Duo or Quad Core CPUs and 2GB DDR2 memory without dedicated hardware.
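The quoted throughput figures can be cross-checked with a few lines of arithmetic, and the log-domain idea can be illustrated by the identity a·b = 2^(log2 a + log2 b). The sketch below is only an illustration: it uses floating point rather than the fixed-point hardware modules described in the dissertation, and both function names are hypothetical.

```python
import math

def frames_per_second(pixel_rate: float, width: int, height: int) -> float:
    """Frame rate implied by a pixel throughput and a frame size."""
    return pixel_rate / (width * height)

def log_domain_multiply(a: float, b: float) -> float:
    """Multiply two positive numbers with one addition in the log2 domain."""
    return 2 ** (math.log2(a) + math.log2(b))

fps = frames_per_second(180.9e6, 1024, 1024)
print(int(fps))  # 172
print(round(log_domain_multiply(6.0, 7.0), 6))  # 42.0
```

In hardware, the logarithm and antilogarithm would typically be approximated with lookup tables or shift-and-add logic, which is where the savings over a full multiplier come from; that detail is an assumption here, not a description of the prototype.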

    A review of snapshot multidimensional optical imaging: Measuring photon tags in parallel

    Multidimensional optical imaging has seen remarkable growth in the past decade. Rather than measuring only the two-dimensional spatial distribution of light, as in conventional photography, multidimensional optical imaging captures light in up to nine dimensions, providing unprecedented information about incident photons’ spatial coordinates, emittance angles, wavelength, time, and polarization. Multidimensional optical imaging can be accomplished either by scanning or parallel acquisition. Compared with scanning-based imagers, parallel acquisition–also dubbed snapshot imaging–has a prominent advantage in maximizing optical throughput, particularly when measuring a datacube of high dimensions. Here, we first categorize snapshot multidimensional imagers based on their acquisition and image reconstruction strategies, then highlight the snapshot advantage in the context of optical throughput, and finally we discuss their state-of-the-art implementations and applications

    Earth imaging with microsatellites: An investigation, design, implementation and in-orbit demonstration of electronic imaging systems for earth observation on-board low-cost microsatellites.

    This research programme has studied the possibilities and difficulties of using 50 kg microsatellites to perform remote imaging of the Earth. The design constraints of these missions are quite different to those encountered in larger, conventional spacecraft. While the main attractions of microsatellites are low cost and fast response times, they present the following key limitations: Payload mass under 5 kg, Continuous payload power under 5 Watts, peak power up to 15 Watts, Narrow communications bandwidths (9.6 / 38.4 kbps), Attitude control to within 5°, No moving mechanics. The most significant factor is the limited attitude stability. Without sub-degree attitude control, conventional scanning imaging systems cannot preserve scene geometry, and are therefore poorly suited to current microsatellite capabilities. The foremost conclusion of this thesis is that electronic cameras, which capture entire scenes in a single operation, must be used to overcome the effects of the satellite's motion. The potential applications of electronic cameras, including microsatellite remote sensing, have erupted with the recent availability of high sensitivity field-array CCD (charge-coupled device) image sensors. The research programme has established suitable techniques and architectures necessary for CCD sensors, cameras and entire imaging systems to fulfil scientific/commercial remote sensing despite the difficult conditions on microsatellites. The author has refined these theories by designing, building and exploiting in-orbit five generations of electronic cameras. The major objective of meteorological scale imaging was conclusively demonstrated by the Earth imaging camera flown on the UoSAT-5 spacecraft in 1991. Improved cameras have since been carried by the KITSAT-1 (1992) and PoSAT-1 (1993) microsatellites. 
PoSAT-1 also flies a medium resolution camera (200 metres) which (despite complete success) has highlighted certain limitations of microsatellites for high resolution remote sensing. A reworked, and extensively modularised, design has been developed for the four camera systems deployed on the FASat-Alfa mission (1995). Based on the success of these missions, this thesis presents many recommendations for the design of microsatellite imaging systems. The novelty of this research programme has been the principle of designing practical camera systems to fit on an existing, highly restrictive, satellite platform, rather than conceiving a fictitious small satellite to support a high performance scanning imager. This pragmatic approach has resulted in the first incontestable demonstrations of the feasibility of remote sensing of the Earth from inexpensive microsatellites

    Non-contact vision-based deformation monitoring on bridge structures

    Information on deformation is an important metric for bridge condition and performance assessment, e.g. identifying abnormal events, calibrating bridge models and estimating load carrying capacities. However, accurate measurement of bridge deformation, especially for long-span bridges, remains a challenging task. The major aim of this research is to develop practical and cost-effective techniques for accurate deformation monitoring on bridge structures. Vision-based systems are taken as the study focus for several reasons: low cost, easy installation, desired sample rates, and remote and distributed sensing. This research proposes a custom-developed vision-based system for bridge deformation monitoring. The system supports either consumer-grade or professional cameras and incorporates four advanced video tracking methods to adapt to different test situations. The sensing accuracy is first quantified in laboratory conditions. The working performance in field testing is evaluated on one short-span and one long-span bridge, considering several influential factors, i.e. long-range sensing, low-contrast target patterns, pattern changes and lighting changes. Through case studies, some suggestions about tracking method selection are summarised for field testing. Possible limitations of vision-based systems are illustrated as well. To overcome observed limitations of vision-based systems, this research further proposes a mixed system combining cameras with accelerometers for accurate deformation measurement. To integrate displacement with acceleration data autonomously, a novel data fusion method based on Kalman filter and maximum likelihood estimation is proposed. Through field test validation, the method is shown to be effective for improving displacement accuracy and widening frequency bandwidth. The mixed system based on data fusion is implemented on field testing of a railway bridge considering undesired test conditions (e.g.
low-contrast target patterns and camera shake). Analysis results indicate that the system offers higher accuracy than using a camera alone and is viable for bridge influence line estimation. With considerable accuracy and resolution in time and frequency domains, the potential of vision-based measurement for vibration monitoring is investigated. The proposed vision-based system is applied on a cable-stayed footbridge for deck deformation and cable vibration measurement under pedestrian loading. Analysis results indicate that the measured data enables accurate estimation of modal frequencies and could be used to investigate variations of modal frequencies under varying pedestrian loads. The vision-based system in this application is used for multi-point vibration measurement and provides results comparable to those obtained using an array of accelerometers
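The idea of fusing camera displacement with accelerometer data can be illustrated with a textbook constant-acceleration Kalman filter. This is a minimal sketch and not the method proposed in the thesis: it omits the maximum likelihood estimation step, and the function name, the zero initial state, and the noise variances are assumptions chosen for illustration.

```python
from typing import List, Optional

def fuse_displacement_acceleration(
    displacements: List[Optional[float]],  # camera samples; None = no frame
    accelerations: List[float],            # accelerometer samples, one per step
    dt: float,
    accel_var: float = 1e-2,               # assumed accelerometer noise variance
    meas_var: float = 0.04,                # assumed camera noise variance
) -> List[float]:
    """Return fused displacement estimates, one per time step."""
    d, v = 0.0, 0.0                          # state: displacement, velocity
    p00, p01, p10, p11 = 1.0, 0.0, 0.0, 1.0  # state covariance P
    fused = []
    for z, a in zip(displacements, accelerations):
        # Predict: integrate the measured acceleration over one step.
        d = d + v * dt + 0.5 * a * dt * dt
        v = v + a * dt
        # P <- F P F^T + Q, with F = [[1, dt], [0, 1]] and
        # Q = accel_var * G G^T, G = [dt^2 / 2, dt].
        n00 = p00 + dt * (p01 + p10) + dt * dt * p11 + accel_var * 0.25 * dt ** 4
        n01 = p01 + dt * p11 + accel_var * 0.5 * dt ** 3
        n10 = p10 + dt * p11 + accel_var * 0.5 * dt ** 3
        n11 = p11 + accel_var * dt * dt
        p00, p01, p10, p11 = n00, n01, n10, n11
        # Update: correct with the camera displacement when a frame exists.
        if z is not None:
            s = p00 + meas_var               # innovation variance
            k0, k1 = p00 / s, p10 / s        # Kalman gain for H = [1, 0]
            y = z - d                        # innovation
            d += k0 * y
            v += k1 * y
            p00, p01, p10, p11 = (
                (1 - k0) * p00, (1 - k0) * p01,
                p10 - k1 * p00, p11 - k1 * p01,
            )
        fused.append(d)
    return fused
```

Passing `None` for steps without a camera frame lets the accelerometer run at a higher rate than the camera, which is one way such a filter can widen the usable frequency bandwidth.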

    Designing a New Tactile Display Technology and its Disability Interactions

    People with visual impairments have a strong desire for a refreshable tactile interface that can provide immediate access to a full page of Braille and tactile graphics. Regrettably, existing devices come at a considerable expense and remain out of reach for many. The exorbitant costs associated with current tactile displays stem from their intricate design and the multitude of components needed for their construction. This underscores the pressing need for technological innovation that can enhance tactile displays, making them more accessible and available to individuals with visual impairments. This research thesis delves into the development of a novel tactile display technology known as Tacilia. The need for this technology and its requirements are informed by in-depth qualitative engagements with students who have visual impairments, alongside a systematic analysis of the prevailing architectures underpinning existing tactile display technologies. The evolution of Tacilia unfolds through iterative processes encompassing conceptualisation, prototyping, and evaluation. With Tacilia, three distinct products and interactive experiences are explored, empowering individuals to manually draw tactile graphics, generate digitally designed media through printing, and display these creations on a dynamic pin array display. This innovation underscores Tacilia's capability to streamline the creation of refreshable tactile displays, rendering them more fitting, usable, and economically viable for people with visual impairments.