
    Advanced editing methods for image and video sequences

    In the context of image and video editing, this thesis proposes methods for modifying the semantic content of a recorded scene. Two different editing problems are approached: first, the removal of ghosting artifacts from high dynamic range (HDR) images recovered from exposure sequences, and second, the removal of objects from video sequences recorded with and without camera motion. These edits need to be performed in a way that the result looks plausible to humans, but without having to recover detailed models of the content of the scene, e.g. its geometry, reflectance, or illumination. The proposed editing methods add new key ingredients, such as camera noise models and global optimization frameworks, that help achieve results surpassing the capabilities of state-of-the-art methods. Using these ingredients, each proposed method defines local visual properties that closely approximate the specific editing requirements of each task. These properties are then encoded into an energy function that, when globally minimized, produces the required editing results. The optimization of such energy functions corresponds to Bayesian inference problems that are solved efficiently using graph cuts. The proposed methods are demonstrated to outperform other state-of-the-art methods. Furthermore, they are demonstrated to work well on complex real-world scenarios that have not been previously addressed in the literature, i.e., highly cluttered scenes for HDR deghosting, and highly dynamic scenes and unconstrained camera motion for object removal from videos.
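
    The energy-minimization-via-graph-cuts formulation mentioned above can be illustrated with a minimal sketch: a binary labeling of a 1-D signal with per-element data costs and a pairwise smoothness penalty, solved exactly as an s-t min-cut. This is not the thesis's actual energy (the data terms, noise model, and neighborhood structure here are placeholders); it only shows the construction.

    ```python
    import networkx as nx
    import numpy as np

    # Toy binary labeling: decide for each sample whether it belongs to model 0 or model 1.
    # E(x) = sum_i D_i(x_i) + lam * sum_{i~j} [x_i != x_j], minimized exactly by an s-t min-cut.
    signal = np.array([0.1, 0.2, 0.15, 0.8, 0.9, 0.85, 0.2, 0.1])
    mu0, mu1, lam = 0.1, 0.9, 0.05           # placeholder model means and smoothness weight

    G = nx.DiGraph()
    for i, v in enumerate(signal):
        d0 = (v - mu0) ** 2                  # data cost of assigning label 0
        d1 = (v - mu1) ** 2                  # data cost of assigning label 1
        G.add_edge("s", i, capacity=d1)      # this edge is cut when i receives label 1
        G.add_edge(i, "t", capacity=d0)      # this edge is cut when i receives label 0
    for i in range(len(signal) - 1):         # pairwise smoothness between neighbors
        G.add_edge(i, i + 1, capacity=lam)
        G.add_edge(i + 1, i, capacity=lam)

    cut_value, (source_side, sink_side) = nx.minimum_cut(G, "s", "t")
    labels = [1 if i in sink_side else 0 for i in range(len(signal))]
    print(labels)   # -> [0, 0, 0, 1, 1, 1, 0, 0]
    ```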

    The IPAC Image Subtraction and Discovery Pipeline for the intermediate Palomar Transient Factory

    We describe the near real-time transient-source discovery engine for the intermediate Palomar Transient Factory (iPTF), currently in operation at the Infrared Processing and Analysis Center (IPAC), Caltech. We coin this system the IPAC/iPTF Discovery Engine (or IDE). We review the algorithms used for PSF-matching, image subtraction, detection, photometry, and machine-learned (ML) vetting of extracted transient candidates. We also review the performance of our ML classifier. For a limiting signal-to-noise ratio of 4 in relatively unconfused regions, "bogus" candidates from processing artifacts and imperfect image subtractions outnumber real transients by ~10:1. This can be considerably higher for image data with inaccurate astrometric and/or PSF-matching solutions. Despite this occasionally high contamination rate, the ML classifier is able to identify real transients with an efficiency (or completeness) of ~97% for a maximum tolerable false-positive rate of 1% when classifying raw candidates. All subtraction-image metrics, source features, ML probability-based real-bogus scores, contextual metadata from other surveys, and possible associations with known Solar System objects are stored in a relational database for retrieval by the various science working groups. We review our efforts in mitigating false positives and our experience in optimizing the overall system in response to the multitude of science projects underway with iPTF.
    Comment: 66 pages, 21 figures, 7 tables, accepted by PAS
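
    As a rough illustration of the working point quoted above (completeness at a fixed false-positive rate), the following sketch computes it from classifier scores with scikit-learn. The scores here are synthetic placeholders at roughly the 10:1 bogus-to-real ratio mentioned in the abstract, not IDE outputs.

    ```python
    import numpy as np
    from sklearn.metrics import roc_curve

    rng = np.random.default_rng(0)
    # synthetic real-bogus scores: 500 real transients, 5000 bogus candidates (placeholder data)
    y_true = np.concatenate([np.ones(500), np.zeros(5000)])
    y_score = np.concatenate([rng.beta(8, 2, 500), rng.beta(2, 8, 5000)])

    fpr, tpr, thresholds = roc_curve(y_true, y_score)
    at_budget = fpr <= 0.01                      # maximum tolerable false-positive rate of 1%
    completeness = tpr[at_budget].max()          # efficiency achievable within that budget
    threshold = thresholds[at_budget][np.argmax(tpr[at_budget])]
    print(f"completeness at 1% FPR: {completeness:.1%} at score threshold {threshold:.3f}")
    ```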

    Real-Time Algorithms for High Dynamic Range Video

    A recurring problem in capturing video is the scene having a range of brightness values that exceeds the capabilities of the capturing device. An example would be a video camera in a bright outside area, directed at the entrance of a building. Because of the potentially large brightness difference, it may not be possible to capture details of the inside of the building and the outside simultaneously using just one shutter speed setting. This results in under- and overexposed pixels in the video footage. The approach we follow in this thesis to overcome this problem is temporal exposure bracketing, i.e., using a set of images captured in quick sequence at different shutter settings. Each image then captures one facet of the scene's brightness range. When fused together, a high dynamic range (HDR) video frame is created that reveals details in dark and bright regions simultaneously. The process of creating a frame in an HDR video can be thought of as a pipeline where the output of each step is the input to the subsequent one. It begins by capturing a set of regular images using varying shutter speeds. Next, the images are aligned with respect to each other to compensate for camera and scene motion during capture. The aligned images are then merged together to create a single HDR frame containing accurate brightness values of the entire scene. As a last step, the HDR frame is tone mapped in order to be displayable on a regular screen with a lower dynamic range. This thesis covers algorithms for these steps that allow the creation of HDR video in real time. When creating videos instead of still images, the focus lies on high capture and processing speed and on assuring temporal consistency between the video frames. In order to achieve this goal, we take advantage of the knowledge gained from the processing of previous frames in the video.
    This work addresses the following aspects in particular. The image size parameters for the set of base images are chosen such that as little image data as possible is captured. We make use of the fact that it is not always necessary to capture full-size images when only small portions of the scene require HDR. Avoiding redundancy in the image material is an obvious approach to reducing the overall time taken to generate a frame. With the aid of the previous frames, we calculate brightness statistics of the scene. The exposure values are chosen such that frequently occurring brightness values are well-exposed in at least one of the images in the sequence. The base images from which the HDR frame is created are captured in quick succession. The effects of intermediate camera motion are thus less intense than in the still image case, and a comparably simpler camera motion model can be used. At the same time, however, there is much less time available to estimate motion. For this reason, we use a fast heuristic that makes use of the motion information obtained in previous frames. It is robust to the large brightness difference between the images of an exposure sequence. The range of luminance values of an HDR frame must be tone mapped to the displayable range of the output device. Most available tone mapping operators are designed for still images and scale the dynamic range of each frame independently. In situations where the scene's brightness statistics change quickly, these operators produce visible image flicker. We have developed an algorithm that detects such situations in an HDR video. Based on this detection, a temporal stability criterion for the tone mapping parameters then prevents image flicker. All methods for capture, creation and display of HDR video introduced in this work have been fully implemented, tested and integrated into a running HDR video system. The algorithms were analyzed for parallelizability and, if applicable, adjusted and implemented on a high-performance graphics chip.
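
    As a loose illustration of the flicker problem described above, the sketch below tone maps each HDR frame with a global operator whose key (log-average luminance) is low-pass filtered over time, so the mapping cannot jump between frames. This is a generic stabilization idea under stated assumptions (linear luminance input, a Reinhard-style global operator), not the detection-based criterion developed in the thesis; the input frames are synthetic placeholders.

    ```python
    import numpy as np

    def tonemap_frame(hdr_lum, smoothed_key, alpha=0.18, tau=0.9):
        """Tone map one HDR luminance frame; smoothed_key carries temporal state."""
        eps = 1e-6
        frame_key = np.exp(np.mean(np.log(hdr_lum + eps)))      # log-average luminance
        if smoothed_key is None:
            smoothed_key = frame_key
        else:
            # temporal stability: low-pass filter the key so it cannot jump between frames
            smoothed_key = tau * smoothed_key + (1.0 - tau) * frame_key
        scaled = alpha * hdr_lum / smoothed_key
        ldr = scaled / (1.0 + scaled)                            # global compression to [0, 1)
        return ldr, smoothed_key

    # usage across a (placeholder) video: carry the smoothed key from frame to frame
    rng = np.random.default_rng(0)
    hdr_video_luminances = [np.exp(rng.normal(0.0, 2.0, (120, 160))) for _ in range(5)]
    state = None
    for hdr_lum in hdr_video_luminances:
        ldr, state = tonemap_frame(hdr_lum, state)
    ```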

    Multiexposure and multifocus image fusion with multidimensional camera shake compensation

    Multiexposure image fusion algorithms are used for enhancing the perceptual quality of an image captured by sensors of limited dynamic range. This is achieved by rendering a single scene based on multiple images captured at different exposure times. Similarly, multifocus image fusion is used when the limited depth of focus at a selected focus setting of a camera results in parts of an image being out of focus. The solution adopted is to fuse together a number of multifocus images to create an image that is focused throughout. A single algorithm that can perform both multifocus and multiexposure image fusion is proposed. This algorithm is a new approach in which a set of unregistered multiexposure or multifocus images is first registered before being fused, to compensate for the possible presence of camera shake. The registration of images is done by identifying matching key-points in constituent images using the scale-invariant feature transform (SIFT). The random sample consensus (RANSAC) algorithm is used to identify inliers among the SIFT key-points, removing outliers that can cause errors in the registration process. Finally, the coherent point drift algorithm is used to register the images, preparing them to be fused in the subsequent fusion stage. For the fusion of images, a new approach based on an improved version of a wavelet-based contourlet transform is used. The experimental results and the detailed analysis presented show that the proposed algorithm is capable of producing high dynamic range (HDR) or multifocus images by registering and fusing a set of multiexposure or multifocus images taken in the presence of camera shake. Further, a comparison of the performance of the proposed algorithm with a number of state-of-the-art algorithms and commercial software packages is provided. In particular, our literature review has revealed that this is one of the first attempts in which the compensation of camera shake, a very likely practical problem when capturing HDR images using handheld devices, has been addressed as part of a multifocus and multiexposure image enhancement system. © 2013 Society of Photo-Optical Instrumentation Engineers (SPIE)
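
    The registration stage described above (SIFT key-points, RANSAC inlier selection) can be sketched with OpenCV as below. For simplicity this sketch estimates a homography from the RANSAC inliers rather than running coherent point drift, and it omits the contourlet-based fusion stage; grayscale uint8 inputs are assumed.

    ```python
    import cv2
    import numpy as np

    def register_pair(moving, reference):
        """Warp `moving` onto `reference` using SIFT matches filtered by RANSAC."""
        sift = cv2.SIFT_create()
        kp1, des1 = sift.detectAndCompute(moving, None)
        kp2, des2 = sift.detectAndCompute(reference, None)

        # ratio-test matching of SIFT descriptors
        matcher = cv2.BFMatcher(cv2.NORM_L2)
        matches = matcher.knnMatch(des1, des2, k=2)
        good = [m for m, n in matches if m.distance < 0.75 * n.distance]

        src = np.float32([kp1[m.queryIdx].pt for m in good]).reshape(-1, 1, 2)
        dst = np.float32([kp2[m.trainIdx].pt for m in good]).reshape(-1, 1, 2)

        # RANSAC keeps inlier correspondences and fits a homography modeling the camera shake
        H, inlier_mask = cv2.findHomography(src, dst, cv2.RANSAC, 5.0)
        h, w = reference.shape[:2]
        return cv2.warpPerspective(moving, H, (w, h))
    ```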

    High Dynamic Range Spectral Imaging Pipeline For Multispectral Filter Array Cameras

    Spectral filter array imaging exhibits a strong similarity with color filter array imaging. This permits us to embed this technology in practical vision systems with little adaptation of existing solutions. In this communication, we define an imaging pipeline that permits high dynamic range (HDR) spectral imaging, extended from color filter array pipelines. We propose an implementation of this pipeline on a prototype sensor and evaluate the quality of our implementation results on real data with objective metrics and visual examples. We demonstrate that we reduce noise and, in particular, we solve the problem of noise generated by the lack o
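
    A minimal sketch of the HDR-merging step such a pipeline needs, applied independently to one (already demosaicked) spectral channel: a weighted per-pixel average in linear radiance. This assumes a linear sensor response and known exposure times; it is a generic merge, not the paper's specific pipeline.

    ```python
    import numpy as np

    def merge_exposures(channel_stack, exposure_times):
        """Merge one spectral channel from several exposures into relative linear radiance.

        channel_stack: (N, H, W) array in [0, 1], one demosaicked image per exposure.
        exposure_times: length-N exposure times in seconds.
        """
        stack = np.asarray(channel_stack, dtype=np.float64)
        times = np.asarray(exposure_times, dtype=np.float64).reshape(-1, 1, 1)

        # hat weighting: trust mid-range pixels, down-weight under- and over-exposed ones
        weights = np.clip(1.0 - np.abs(2.0 * stack - 1.0), 1e-4, None)

        # each exposure's radiance estimate is value/time; combine with a weighted average
        return np.sum(weights * stack / times, axis=0) / np.sum(weights, axis=0)
    ```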

    Super resolution and dynamic range enhancement of image sequences

    Camera producers try to increase the spatial resolution of a camera by reducing the size of the sites on the sensor array. However, shot noise causes the signal-to-noise ratio to drop as sensor sites get smaller. This fact motivates resolution enhancement to be performed through software. Super resolution (SR) image reconstruction aims to combine degraded images of a scene in order to form an image which has higher resolution than all observations. There is a demand for high resolution images in biomedical imaging, surveillance, aerial/satellite imaging and high-definition TV (HDTV) technology. Although extensive research has been conducted in SR, little attention has been given to increasing the resolution of images under illumination changes. In this study, a unique framework is proposed to increase the spatial resolution and dynamic range of a video sequence using Bayesian and Projection onto Convex Sets (POCS) methods. Incorporating camera response function estimation into image reconstruction allows dynamic range enhancement along with spatial resolution improvement. Photometrically varying input images complicate the process of projecting observations onto a common grid by violating brightness constancy. A contrast-invariant feature transform is proposed in this thesis to register input images with high illumination variation. The proposed algorithm increases the repeatability rate of detected features among frames of a video. The repeatability rate is increased by computing the autocorrelation matrix using the gradients of contrast-stretched input images. The presented contrast-invariant feature detection improves the repeatability rate of the Harris corner detector by around 25% on average. Joint multi-frame demosaicking and resolution enhancement is also investigated in this thesis. A color constancy constraint set is devised and incorporated into the POCS framework for increasing the resolution of color-filter-array-sampled images. The proposed method produces fewer demosaicking artifacts than the existing POCS method and a higher visual quality in the final image.
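
    The contrast-stretching idea behind the repeatability improvement reported above can be sketched as follows: stretch each frame's intensity range before running a standard Harris detector, then measure how many corners recur between two differently exposed frames. This is a simplified stand-in for the thesis's detector (it stretches the image rather than modifying the autocorrelation matrix computation directly) and assumes the two frames are already aligned.

    ```python
    import cv2
    import numpy as np

    def harris_corners(gray, max_corners=500):
        """Detect Harris corners after a linear contrast stretch of the input frame."""
        g = gray.astype(np.float32)
        g = (g - g.min()) / max(float(g.max() - g.min()), 1e-6)   # stretch to [0, 1]
        g8 = (g * 255.0).astype(np.uint8)
        pts = cv2.goodFeaturesToTrack(g8, maxCorners=max_corners, qualityLevel=0.01,
                                      minDistance=5, useHarrisDetector=True, k=0.04)
        return pts.reshape(-1, 2) if pts is not None else np.empty((0, 2))

    def repeatability(pts_a, pts_b, radius=2.0):
        """Fraction of corners in frame A with a corner within `radius` pixels in frame B."""
        if len(pts_a) == 0 or len(pts_b) == 0:
            return 0.0
        dists = np.linalg.norm(pts_a[:, None, :] - pts_b[None, :, :], axis=2)
        return float(np.mean(dists.min(axis=1) <= radius))
    ```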

    Improving SLI Performance in Optically Challenging Environments

    The construction of 3D models of real-world scenes using non-contact methods is an important problem in computer vision. Some of the more successful methods belong to a class of techniques called structured light illumination (SLI). While SLI methods are generally very successful, there are cases where their performance is poor. Examples include scenes with a high dynamic range in albedo or scenes with strong interreflections. These scenes are referred to as optically challenging environments. The work in this dissertation is aimed at improving SLI performance in optically challenging environments. A new method of high dynamic range imaging (HDRI) based on pixel-by-pixel Kalman filtering is developed. Using objective metrics, it is shown to achieve as much as a 9.4 dB improvement in signal-to-noise ratio and as much as a 29% improvement in radiometric accuracy over a classic method. Quality checks are developed to detect and quantify multipath interference and other quality defects using phase measuring profilometry (PMP). Techniques are established to improve SLI performance in the presence of strong interreflections. Approaches in compressed sensing are applied to SLI, and interreflections in a scene are modeled using SLI. Several different applications of this research are also discussed.
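
    A minimal sketch of pixel-by-pixel Kalman filtering for HDR imaging, under simplifying assumptions (static scene, identity process model, exposure-normalized observations with known per-pixel noise variance); the dissertation's actual measurement and noise models are not reproduced here, and the input arrays are synthetic placeholders.

    ```python
    import numpy as np

    def kalman_update(x, P, z, R):
        """One scalar Kalman update per pixel: radiance estimate x with variance P,
        new exposure-normalized observation z with noise variance R."""
        K = P / (P + R)                          # Kalman gain
        return x + K * (z - x), (1.0 - K) * P

    rng = np.random.default_rng(0)
    truth = rng.uniform(0.0, 1.0, (120, 160))    # placeholder radiance map
    x = np.zeros_like(truth)                     # initial estimate
    P = np.full_like(truth, 1e3)                 # large initial uncertainty
    for _ in range(8):                           # eight noisy exposures
        R = np.full_like(truth, 0.01)            # per-pixel observation noise variance
        z = truth + rng.normal(0.0, np.sqrt(R))  # synthetic observation
        x, P = kalman_update(x, P, z, R)
    ```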

    Exposure Fusion Using Boosting Laplacian Pyramid

    This paper proposes a new exposure fusion approach for producing a high-quality image from multiple exposure images. A novel hybrid exposure weight measurement is developed, based on a local weight and a global weight that consider the exposure quality between different exposure images, together with a just-noticeable-distortion-based saliency weight. This new hybrid weight is guided not only by a single image's exposure level but also by the relative exposure level between different exposure images. The core of the approach is our novel boosting Laplacian pyramid, which boosts the detail and base signals separately, with the boosting process guided by the proposed exposure weight. Our approach can effectively blend the multiple exposure images for static scenes while preserving both color appearance and texture structure. Our experimental results demonstrate that the proposed approach successfully produces visually pleasing exposure fusion images with better color appearance and more texture details than existing exposure fusion techniques and tone mapping operators.
    Index Terms: boosting Laplacian pyramid, exposure fusion, global and local exposure weight, gradient vector
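
    For context, the sketch below runs standard exposure fusion (Mertens et al.) via OpenCV's built-in implementation, which performs Laplacian-pyramid blending with contrast, saturation, and well-exposedness weights; it is not the paper's boosting Laplacian pyramid or hybrid weight, just the baseline family it extends. The file names are placeholders.

    ```python
    import cv2

    # placeholder file names; exposures of the same static scene, already aligned
    paths = ["under.jpg", "mid.jpg", "over.jpg"]
    imgs = [cv2.imread(p) for p in paths]

    fusion = cv2.createMergeMertens()        # pyramid-based exposure fusion baseline
    fused = fusion.process(imgs)             # float32 result, roughly in [0, 1]
    cv2.imwrite("fused.png", (fused * 255).clip(0, 255).astype("uint8"))
    ```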

    First results from high redshift quasar searches in VIKING

    This thesis presents the discovery of the first luminous z ≳ 6.5 quasars in the VISTA Kilo-degree Infrared Galaxy Survey (VIKING). After some basic quality control, quasar selection is investigated using initial data supplemented with detailed modelling of the photometric and spatial distributions of stars of spectral type M, L and T, which are known to be the cause of significant contamination in quasar colour selection spaces. Optimised selection constraints are placed on detection significance and morphology, and the performance of a traditional colour selection technique is compared to a Bayesian model comparison technique. The latter is found to offer a ∼10 per cent gain in completeness over traditional colour selection. Quasar candidates are ranked via Bayesian model comparison and a subset of the highest ranked objects are put forward for follow-up imaging. In June 2011, 44 high-z quasar candidates underwent deep optical i- and z-band imaging on the ESO NTT. Just 6 of these candidates were found to have optical colours consistent with z ≳ 6.5 quasars. Spectroscopic follow-up of these objects is ongoing, but thus far three new quasars have been discovered at redshifts of z = 6.5, 6.7 and 6.9. This discovery rate is consistent with zero evolution in the rate of decline in quasar space density from z ≳ 6.4. This differs from the latest results from UKIDSS. Further results expected from these and other surveys will begin to constrain the true nature of quasar space density evolution in the near future. The discovery of three z ≥ 6.5 quasars in VIKING is a significant highlight in the first year of VISTA science operations. These quasars will remain important probes of the high-z universe throughout the next decade.
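
    The Bayesian model comparison used for candidate ranking can be illustrated with a toy sketch: the posterior probability that an object is a quasar rather than an M/L/T star, given its photometry, under Gaussian flux likelihoods averaged over each model set. The priors, templates and error model here are placeholders, not the population densities modelled in the thesis.

    ```python
    import numpy as np

    def quasar_posterior(fluxes, flux_errors, quasar_models, star_models,
                         prior_quasar=1e-4):
        """P(quasar | photometry) for one candidate, comparing two sets of model fluxes.

        quasar_models, star_models: (M, n_bands) arrays of template fluxes (placeholders).
        """
        def mean_likelihood(models):
            chi2 = np.sum(((fluxes - models) / flux_errors) ** 2, axis=1)
            return float(np.mean(np.exp(-0.5 * chi2)))   # likelihood averaged over the set

        lq = mean_likelihood(np.atleast_2d(quasar_models))
        ls = mean_likelihood(np.atleast_2d(star_models))
        evidence = prior_quasar * lq + (1.0 - prior_quasar) * ls
        return prior_quasar * lq / evidence if evidence > 0.0 else 0.0

    # toy usage: three bands (e.g. z, Y, J); all numbers are purely illustrative
    candidate = np.array([0.1, 1.0, 1.1])
    errors = np.array([0.1, 0.1, 0.1])
    quasar_templates = np.array([[0.05, 1.0, 1.05], [0.0, 0.9, 1.0]])
    star_templates = np.array([[0.8, 1.0, 1.2], [1.0, 1.1, 1.3]])
    print(quasar_posterior(candidate, errors, quasar_templates, star_templates))
    ```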