
    Real-Time Algorithms for High Dynamic Range Video

    A recurring problem in capturing video is that the scene has a range of brightness values exceeding the capabilities of the capturing device. An example would be a video camera in a bright outdoor area, directed at the entrance of a building. Because of the potentially large brightness difference, it may not be possible to capture details of the inside of the building and the outside simultaneously using just one shutter speed setting. This results in under- and overexposed pixels in the video footage. The approach we follow in this thesis to overcome this problem is temporal exposure bracketing, i.e., using a set of images captured in quick sequence at different shutter settings. Each image then captures one facet of the scene's brightness range. When fused together, a high dynamic range (HDR) video frame is created that reveals details in dark and bright regions simultaneously.

    The process of creating a frame in an HDR video can be thought of as a pipeline where the output of each step is the input to the subsequent one. It begins by capturing a set of regular images using varying shutter speeds. Next, the images are aligned with respect to each other to compensate for camera and scene motion during capture. The aligned images are then merged together to create a single HDR frame containing accurate brightness values of the entire scene. As a last step, the HDR frame is tone mapped in order to be displayable on a regular screen with a lower dynamic range. This thesis covers algorithms for these steps that allow the creation of HDR video in real-time. When creating videos instead of still images, the focus lies on high capture and processing speed and on assuring temporal consistency between the video frames. In order to achieve this goal, we take advantage of the knowledge gained from the processing of previous frames in the video.

    This work addresses the following aspects in particular. The image size parameters for the set of base images are chosen such that as little image data as possible is captured. We make use of the fact that it is not always necessary to capture full-size images when only small portions of the scene require HDR. Avoiding redundancy in the image material is an obvious approach to reducing the overall time taken to generate a frame. With the aid of the previous frames, we calculate brightness statistics of the scene. The exposure values are chosen such that frequently occurring brightness values are well-exposed in at least one of the images in the sequence. The base images from which the HDR frame is created are captured in quick succession. The effects of intermediate camera motion are thus less intense than in the still image case, and a comparably simpler camera motion model can be used. At the same time, however, there is much less time available to estimate motion. For this reason, we use a fast heuristic that makes use of the motion information obtained in previous frames and is robust to the large brightness difference between the images of an exposure sequence.

    The range of luminance values of an HDR frame must be tone mapped to the displayable range of the output device. Most available tone mapping operators are designed for still images and scale the dynamic range of each frame independently. In situations where the scene's brightness statistics change quickly, these operators produce visible image flicker. We have developed an algorithm that detects such situations in an HDR video. Based on this detection, a temporal stability criterion for the tone mapping parameters then prevents image flicker. All methods for capture, creation and display of HDR video introduced in this work have been fully implemented, tested and integrated into a running HDR video system. The algorithms were analyzed for parallelizability and, if applicable, adjusted and implemented on a high-performance graphics chip.
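
    Two central pipeline steps, merging the exposure sequence into an HDR frame and tone mapping it without flicker, can be sketched as follows. This is a minimal generic illustration (hat-shaped pixel weighting, a Reinhard-style global operator, and exponential smoothing of the log-average luminance), not the thesis's actual algorithms:

```python
import numpy as np

def merge_ldr_to_hdr(images, exposure_times):
    """Merge aligned LDR frames (float arrays in [0, 1]) into one HDR
    radiance map, weighting each pixel by how well-exposed it is."""
    acc = np.zeros_like(images[0], dtype=np.float64)
    wsum = np.zeros_like(acc)
    for img, t in zip(images, exposure_times):
        w = 1.0 - np.abs(2.0 * img - 1.0)      # hat weight, peak at mid-grey
        acc += w * (img / t)                   # per-image radiance estimate
        wsum += w
    return acc / np.maximum(wsum, 1e-6)

class FlickerFreeToneMapper:
    """Global Reinhard-style operator whose scene 'key' is low-pass
    filtered over time so that rapid changes in the scene's brightness
    statistics do not cause frame-to-frame flicker."""
    def __init__(self, alpha=0.18, smoothing=0.9):
        self.alpha = alpha          # target key value
        self.smoothing = smoothing  # temporal low-pass factor
        self.log_avg = None         # running log-average luminance

    def __call__(self, hdr):
        cur = np.exp(np.mean(np.log(hdr + 1e-6)))
        if self.log_avg is None:
            self.log_avg = cur
        else:  # temporal stability: smooth the tone mapping parameter
            self.log_avg = (self.smoothing * self.log_avg
                            + (1 - self.smoothing) * cur)
        scaled = (self.alpha / self.log_avg) * hdr
        return scaled / (1.0 + scaled)         # compress into [0, 1)
```

    In a video loop, the same FlickerFreeToneMapper instance is reused across frames, which is what carries the temporal state from one frame to the next.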

    Large-Scale Textured 3D Scene Reconstruction

    The creation of three-dimensional environment models is a fundamental task in computer vision. Reconstructions are useful for a range of applications, such as surveying, the preservation of cultural heritage, or the creation of virtual worlds in the entertainment industry. In the field of automated driving, they help address a variety of challenges, including localization, the annotation of large data sets, and the fully automatic generation of simulation scenarios. The challenge in 3D reconstruction is the joint estimation of sensor poses and an environment model. Redundant and potentially erroneous measurements from different sensors must be integrated into a common representation of the world in order to obtain a metrically and photometrically correct model. At the same time, the method must use resources efficiently to achieve runtimes that enable practical use. In this work, we present a reconstruction method capable of creating photorealistic 3D reconstructions of large areas spanning several kilometres. Range measurements from laser scanners and stereo camera systems are fused using a volumetric reconstruction method. Loop closures are detected and introduced as additional constraints in order to obtain a globally consistent map. The resulting mesh is textured from camera images, with the individual observations weighted by their quality. For a seamless appearance, the unknown exposure times and parameters of the optical system are jointly estimated and the images corrected accordingly. We evaluate our method on synthetic data, real sensor data from our test vehicle, and publicly available data sets. We show qualitative results for large inner-city areas, as well as quantitative evaluations of the vehicle trajectory and the reconstruction quality. Finally, we present several applications, demonstrating the usefulness of our method for applications in the field of automated driving.
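
    A common choice for volumetrically fusing redundant, noisy range measurements is a truncated signed distance function (TSDF) volume; the abstract does not state which representation the thesis uses, so the following is only a generic sketch of the idea:

```python
import numpy as np

class TSDFVolume:
    """Minimal truncated signed distance function (TSDF) volume: every new
    range observation is folded into a per-voxel weighted running average,
    one standard way to fuse depth data from several sensors (e.g. laser
    scanners and stereo cameras) into a single consistent surface model."""
    def __init__(self, shape, trunc_dist):
        self.tsdf = np.ones(shape, dtype=np.float32)     # signed distance, init "far"
        self.weight = np.zeros(shape, dtype=np.float32)  # per-voxel confidence
        self.trunc = trunc_dist

    def integrate(self, voxel_idx, signed_dist, obs_weight=1.0):
        """Fold one signed-distance observation into the voxel at voxel_idx."""
        d = np.clip(signed_dist / self.trunc, -1.0, 1.0)  # truncation step
        w_old = self.weight[voxel_idx]
        self.tsdf[voxel_idx] = ((w_old * self.tsdf[voxel_idx] + obs_weight * d)
                                / (w_old + obs_weight))
        self.weight[voxel_idx] = w_old + obs_weight
```

    The surface is then recovered as the zero level set of the fused volume, for example with marching cubes, yielding the mesh that is subsequently textured.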

    Computational Video Enhancement

    During a video, each scene element is often imaged many times by the sensor. I propose that by combining information from each captured frame throughout the video it is possible to enhance the entire video. This concept is the basis of computational video enhancement. In this dissertation, the viability of computational video processing is explored, and applications where this processing method can be leveraged are presented. Spatio-temporal volumes are employed as a framework for efficient computational video processing, and I extend them by introducing sheared volumes. Shearing provides spatial frame warping for alignment between frames, allowing temporally-adjacent samples to be processed using traditional editing and filtering approaches. An efficient filter-graph framework is presented to support this processing, along with a prototype video editing and manipulation tool utilizing that framework. To demonstrate the integration of samples from multiple frames, I introduce methods for enhancing poorly exposed low-light videos. This integration is guided by a tone-mapping process to determine spatially-varying optimal exposures and an adaptive spatio-temporal filter to integrate the samples. Low-light video enhancement is also addressed in the multispectral domain by combining visible and infrared samples, facilitated by a novel multispectral edge-preserving filter that enhances only the visible-spectrum video. Finally, the temporal characteristics of videos are altered by a computational video resampling process. By resampling the video-rate footage, novel time-lapse sequences are found that optimize for user-specified characteristics. Each resulting shorter video is a more faithful summary of the original source than a traditional time-lapse video. Simultaneously, new synthetic exposures are generated to alter the output video's aliasing characteristics.
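
    The sheared-volume idea can be illustrated in a few lines: each frame is shifted by a per-frame offset so that corresponding scene points line up along the time axis, after which an ordinary temporal filter can integrate the samples. The integer shifts and the median filter below are illustrative placeholders, not the dissertation's actual alignment or filtering method:

```python
import numpy as np

def temporal_filter_sheared(frames, shifts):
    """Shift ('shear') each HxW frame by an integer (dy, dx) offset so that
    temporally-adjacent samples of the same scene point are stacked in a
    column, then collapse the stack with a temporal median. Wrap-around at
    the image borders is ignored for brevity."""
    aligned = [np.roll(f, shift=(dy, dx), axis=(0, 1))
               for f, (dy, dx) in zip(frames, shifts)]
    return np.median(np.stack(aligned), axis=0)  # robust noise reduction
```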

    A cost-effective, mobile platform-based, photogrammetric approach for continuous structural deformation monitoring

    With the evolution of construction techniques and materials technology, the design of modern civil engineering infrastructure has become increasingly advanced and complex. In parallel to this, the development and application of appropriate and efficient monitoring technologies has become essential. Improving the performance of structural monitoring systems and reducing labour and total implementation costs have therefore become important issues that scientists and engineers are committed to solving. In this research, a non-intrusive structural monitoring system was developed based on close-range photogrammetric principles. This research aimed to combine the merits of photogrammetry and the latest mobile phone technology to propose a cost-effective, compact (portable) and precise solution for structural monitoring applications. By combining the use of low-cost imaging devices (two or more mobile phone handsets) with in-house control software, a monitoring project can be undertaken on a relatively low budget when compared to conventional methods. The system uses programmable smartphones (Google Android v.2.2 OS) to replace conventional in-situ photogrammetric imaging stations. The developed software suite is able to control multiple handsets to continuously capture high-quality, synchronized image sequences for short- or long-term structural monitoring purposes. The operation is fully automatic and the system can be remotely controlled, freeing the operator from having to attend the site and thus saving considerable labour expense in long-term monitoring tasks. In order to prevent the system from crashing during a long-term monitoring scheme, an automatic system state monitoring program and a system recovery module were developed to enhance stability. Considering that the image resolution of current mobile phone cameras is relatively low (in comparison to contemporary digital SLR cameras), a target detection algorithm was developed for the mobile platform that, when combined with dedicated target patterns, was found to improve the quality of photogrammetric target measurement. When the photogrammetric results were compared with physical measurements made using a Zeiss P3 analytical plotter, the accuracy achieved was 1/67,000. The feasibility of the system has been proven through an indoor simulation test and an outdoor experiment. In terms of using this system for actual structural monitoring applications, the optimal relative accuracy of distance measurement was determined to be approximately 1/28,000 under laboratory conditions, and the outdoor experiment returned a relative accuracy of approximately 1/16,400.
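
    Sub-pixel measurement of dedicated target patterns is what allows modest phone-camera resolutions to reach such relative accuracies (for scale, 1/16,400 corresponds to roughly 1 mm over a 16.4 m span). A common building block is an intensity-weighted centroid; the sketch below is a generic illustration, not the thesis's actual detection algorithm:

```python
import numpy as np

def subpixel_centroid(patch, threshold):
    """Intensity-weighted centroid of a bright target inside an image
    patch. Weighting by intensity above a background threshold gives a
    sub-pixel position estimate far finer than the pixel grid."""
    w = np.where(patch > threshold, patch - threshold, 0.0)  # suppress background
    total = w.sum()
    if total == 0:
        return None                    # no target present in this patch
    ys, xs = np.indices(patch.shape)
    return (ys * w).sum() / total, (xs * w).sum() / total    # (row, col)
```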

    Implementation of a distributed real-time video panorama pipeline for creating high quality virtual views

    Today, we are continuously looking for more immersive video systems. Such systems, however, require more content, which can be costly to produce. A full panorama, covering regions of interest, can contain all the information required, but can be difficult to view in its entirety. In this thesis, we discuss a method for creating virtual views from a cylindrical panorama, allowing multiple users to create individual virtual cameras from the same panorama video. We discuss how this method can be used for video delivery, but emphasize the creation of the initial panorama. The panorama must be created in real-time, and with very high quality. We design and implement a prototype recording pipeline, installed at a soccer stadium, as a part of the Bagadus project. We describe a pipeline capable of producing 4K panorama videos from five HD cameras, in real-time, with possibilities for further upscaling. We explain how the cylindrical panorama can be created with minimal computational cost and without visible seams. The cameras of our prototype system record video in the raw Bayer format, and we also investigate which debayering algorithms are best suited for recording multiple high-resolution video streams in real-time.
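
    The geometric core of cylindrical panorama stitching is the warp from each pinhole camera image onto a common cylinder. A minimal sketch of the source-pixel lookup coordinates for that warp is shown below, assuming a known focal length in pixels; blending, seam placement, and debayering are separate steps:

```python
import numpy as np

def cylindrical_warp_coords(h, w, f):
    """For every pixel of an h x w cylindrical output image, compute where
    to sample the flat pinhole image (focal length f, in pixels). The
    resulting coordinate maps are fed to an interpolating remap."""
    yc, xc = (h - 1) / 2.0, (w - 1) / 2.0
    ys, xs = np.indices((h, w), dtype=np.float64)
    theta = (xs - xc) / f                  # horizontal angle on the cylinder
    height = (ys - yc) / f                 # vertical position on the cylinder
    x_src = f * np.tan(theta) + xc         # back-project to the flat image
    y_src = f * height / np.cos(theta) + yc
    return y_src, x_src
```

    Because the maps depend only on the fixed camera geometry, they can be precomputed once and reused for every frame, which keeps the per-frame cost low enough for real-time operation.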

    Inverse tone mapping

    The introduction of High Dynamic Range Imaging in computer graphics has produced a change in imaging comparable to, or even greater than, the introduction of colour photography. Light can now be captured, stored, processed, and finally visualised without losing information. Moreover, new applications that can exploit the physical values of light have been introduced, such as re-lighting of synthetic/real objects, or enhanced visualisation of scenes. However, these new processing and visualisation techniques cannot be applied to the movies and pictures that photography and cinematography have produced over more than one hundred years. This thesis introduces a general framework for expanding legacy content into High Dynamic Range content. The expansion is achieved while avoiding artefacts, producing images suitable for visualisation and for re-lighting of synthetic/real objects. Moreover, a methodology based on psychophysical experiments and computational metrics is presented for measuring the performance of expansion algorithms. Finally, a compression scheme for High Dynamic Range Textures, inspired by the framework, is proposed and evaluated.
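
    In its simplest form, inverse tone mapping undoes the display encoding and then reallocates the recovered values over a wider luminance range. The sketch below is a deliberately naive illustration of that idea; practical expansion operators additionally build an expand map to keep over-exposed regions artefact-free, which is omitted here:

```python
import numpy as np

def expand_ldr(ldr, gamma=2.2, peak_luminance=1000.0):
    """Naive inverse tone mapping: linearise an LDR image (values in
    [0, 1]) by undoing the display gamma, then expand with a power curve
    that allocates most of the new range to the highlights. The peak
    luminance target (in cd/m^2) is an illustrative choice."""
    linear = np.clip(ldr, 0.0, 1.0) ** gamma   # approximate linearisation
    return peak_luminance * linear ** 1.5      # highlight-weighted expansion
```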

    Design and implementation of real time image acquisition and processing systems

    Nondestructive evaluation (NDE) is a way to characterize a material or a structure without compromising its usability. Generally, the inspection methods of NDE testing may be based on acoustics, penetrating radiation, light, electric and magnetic fields, or more specialized phenomena. Whatever methods are used in NDE, imaging technology is one of the important components of these systems. The rapid growth of sophisticated, low-priced image acquisition and processing devices has opened up the possibility of applying imaging analysis to more NDE areas, and imaging technology is becoming a very powerful tool for characterizing material properties. The objective of this thesis is to develop a robust, open, easily extendable software platform for real-time image acquisition and processing. The platform supports image format conversion, histogram-based look-up tables, real-time image/slice display, and device control integration. Three applications were implemented on this platform. For the Rapid Whole-Kernel Single-Seed Analyzer project, the special requirements for the CCD camera and Liquid Crystal Tunable Filter (LCTF) control were met. Multi-thread synchronization was used to coordinate the CCD camera and the LCTF control. In order to speed up the image acquisition procedure, a predefined palette was used, and overlapping the LCTF tuning time with the image storage time made the overall data acquisition as fast as possible. This thesis also used a 14-bit cooled CCD camera for radiographic digitization; calibration, focusing, and distance measurement were implemented, and tests showed the system could meet the basic requirements for radiographic digitization. In the new X-ray Vision system, real-time image/slice display under multiple video systems was developed. Image integration, averaging, and subtraction were implemented, and a user-friendly interface for motion control was provided. Based on the integration of image acquisition and motion control, the automation of real-time scans was achieved; the result is very flexible and can be used in complicated automatic scanning. The tests of the above three applications showed that this platform has high stability and powerful functionality.
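
    Frame integration, averaging, and subtraction are simple but effective operations in radiographic imaging: averaging N frames reduces random sensor noise by roughly a factor of sqrt(N), and subtracting a dark frame removes fixed-pattern offsets. A minimal sketch follows; the function and parameter names are illustrative, not the platform's actual API:

```python
import numpy as np

def integrate_frames(frames, dark_frame=None):
    """Average a burst of frames to suppress random noise, optionally
    subtracting a dark frame (captured with the source off) to remove
    fixed-pattern sensor offsets."""
    stack = np.stack([f.astype(np.float64) for f in frames])
    avg = stack.mean(axis=0)                  # noise drops ~ 1/sqrt(N)
    if dark_frame is not None:
        avg -= dark_frame.astype(np.float64)  # remove fixed-pattern offset
    return np.clip(avg, 0.0, None)
```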

    Model-Based Environmental Visual Perception for Humanoid Robots

    The visual perception of a robot should answer two fundamental questions: What? and Where? In order to answer these questions properly and efficiently, it is essential to establish a bidirectional coupling between external stimuli and internal representations. This coupling links the physical world with the inner abstraction models through sensor transformation, recognition, matching and optimization algorithms. The objective of this PhD is to establish this sensor-model coupling.

    Real-Time Computational Gigapixel Multi-Camera Systems

    Standard cameras are designed to faithfully mimic the human eye and visual system. In recent years, commercially available cameras have become more complex and offer higher image resolutions than ever before. However, the quality of conventional imaging methods is limited by several parameters, such as the pixel size, the lens system, the diffraction limit, etc. Rapid technological advancements, the increase in available computing power, and the introduction of Graphics Processing Units (GPUs) and Field-Programmable Gate Arrays (FPGAs) open new possibilities in the computer vision and computer graphics communities. Researchers are now focusing on utilizing the immense computational power offered by modern processing platforms to create imaging systems with novel or significantly enhanced capabilities compared to standard ones. One popular type of computational imaging system offering new possibilities is the multi-camera system. This thesis focuses on FPGA-based multi-camera systems that operate in real-time. The aim of the multi-camera systems presented in this thesis is to offer wide field-of-view (FOV) video coverage at high frame rates. The wide FOV is achieved by constructing a panoramic image from the images acquired by the multi-camera system. Two new real-time computational imaging systems that provide new functionalities and better performance compared to conventional cameras are presented in this thesis. Each camera system design and implementation is analyzed in detail, built, and tested in real-time conditions. Panoptic is a miniaturized low-cost multi-camera system that reconstructs a 360-degree view in real-time. Since it is an easily portable system, it provides a means to capture the complete surrounding light field in dynamic environments, such as when mounted on a vehicle or a flying drone. The second presented system, GigaEye II, is a modular high-resolution imaging system that introduces the concept of distributed image processing into real-time camera systems. This thesis explains in detail how such a concept can be used efficiently in real-time computational imaging systems. The purpose of computational imaging systems in the form of multi-camera systems does not end with real-time panoramas. The application scope of these cameras is vast: they can be used in 3D cinematography, for broadcasting live events, or for immersive telepresence experiences. The final chapter of this thesis presents three potential applications of these systems: object detection and tracking, high dynamic range (HDR) imaging, and observation of multiple regions of interest. Object detection and tracking, and observation of multiple regions of interest, are extremely useful and desired capabilities of surveillance systems, in the security and defense industry, and in the fast-growing industry of autonomous vehicles. On the other hand, high dynamic range imaging is becoming a common option in consumer market cameras, and the presented method allows instantaneous capture of HDR videos. Finally, this thesis concludes with a discussion of real-time multi-camera systems, their advantages, their limitations, and future predictions.
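
    The distributed-processing concept can be illustrated in miniature: a high-resolution frame is split into a grid of tiles, the tiles are processed in parallel, and the results are reassembled. The sketch below uses Python multiprocessing as a stand-in for the FPGA pipeline, and the per-tile contrast stretch is only a placeholder workload:

```python
import numpy as np
from multiprocessing import Pool

def process_tile(tile):
    """Placeholder per-tile workload (a simple local contrast stretch)."""
    lo, hi = float(tile.min()), float(tile.max())
    return (tile - lo) / max(hi - lo, 1e-6)

def process_frame_distributed(frame, tiles_y=4, tiles_x=4, workers=4):
    """Split one frame into a tiles_y x tiles_x grid, process the tiles in
    parallel worker processes, and stitch the results back together,
    mirroring how per-region work is spread across processing units."""
    rows = np.array_split(frame, tiles_y, axis=0)
    tiles = [t for row in rows for t in np.array_split(row, tiles_x, axis=1)]
    with Pool(workers) as pool:
        done = pool.map(process_tile, tiles)
    out_rows = [np.concatenate(done[i * tiles_x:(i + 1) * tiles_x], axis=1)
                for i in range(tiles_y)]
    return np.concatenate(out_rows, axis=0)
```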