745 research outputs found

    Automatic Image Registration in Infrared-Visible Videos using Polygon Vertices

    In this paper, an automatic method is proposed to perform image registration in visible and infrared pairs of video sequences for multiple targets. In multimodal image analysis such as image fusion systems, color and IR sensors are placed close to each other and capture the same scene simultaneously, but the videos are not properly aligned by default because of different fields of view, image capturing information, working principles and other camera specifications. Because the scenes are usually not planar, alignment needs to be performed continuously by extracting relevant common information. In this paper, we approximate the shape of the targets by polygons and use an affine transformation to align the two video sequences. After background subtraction, keypoints on the contour of the foreground blobs are detected using the DCE (Discrete Curve Evolution) technique. These keypoints are then described by the local shape at each point of the obtained polygon. The keypoints are matched based on the convexity of the polygon's vertices and the Euclidean distance between them. Only good matches for each local shape polygon in a frame are kept. To achieve a global affine transformation that maximises the overlap of infrared and visible foreground pixels, the matched keypoints of each local shape polygon are stored temporally in a buffer for a few frames. The matrix is evaluated at each frame using the temporal buffer and the best matrix is selected based on an overlap ratio criterion. Our experimental results demonstrate that this method can provide highly accurate registered images and that it outperforms a previous related method.
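The matrix-selection step described in this abstract can be illustrated with a minimal sketch. The abstract does not define the overlap ratio, so the intersection-over-union measure used here is an assumption, and the names `apply_affine`, `overlap_ratio` and `select_best` are hypothetical.

```python
def apply_affine(points, m):
    """Map (x, y) pixel coordinates through a 2x3 affine matrix, rounding to the pixel grid."""
    (a, b, tx), (c, d, ty) = m
    return {(round(a * x + b * y + tx), round(c * x + d * y + ty)) for x, y in points}


def overlap_ratio(ir_pixels, visible_pixels, m):
    """Intersection over union of warped IR foreground and visible foreground.

    Assumption: the paper's exact criterion is not given in the abstract.
    """
    warped = apply_affine(ir_pixels, m)
    union = warped | visible_pixels
    return len(warped & visible_pixels) / len(union) if union else 0.0


def select_best(candidates, ir_pixels, visible_pixels):
    """Return the candidate matrix with the highest overlap ratio."""
    return max(candidates, key=lambda m: overlap_ratio(ir_pixels, visible_pixels, m))


# Toy data: a 3x3 IR blob and the same blob shifted by (+2, +1) in the visible frame.
ir_blob = {(x, y) for x in range(3) for y in range(3)}
visible_blob = {(x + 2, y + 1) for x, y in ir_blob}

identity = ((1, 0, 0), (0, 1, 0))
shift = ((1, 0, 2), (0, 1, 1))
best = select_best([identity, shift], ir_blob, visible_blob)
```

In this toy case the pure translation aligns the two blobs exactly, so it is preferred over the identity matrix.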

    Online Mutual Foreground Segmentation for Multispectral Stereo Videos

    The segmentation of video sequences into foreground and background regions is a low-level process commonly used in video content analysis and smart surveillance applications. Using a multispectral camera setup can improve this process by providing more diverse data to help identify objects despite adverse imaging conditions. The registration of several data sources is however not trivial if the appearance of objects produced by each sensor differs substantially. This problem is further complicated when using close-range stereo pairs, where parallax effects cannot be ignored. In this work, we present a new method to simultaneously tackle multispectral segmentation and stereo registration. Using an iterative procedure, we estimate the labeling result for one problem using the provisional result of the other. Our approach is based on the alternating minimization of two energy functions that are linked through the use of dynamic priors. We rely on the integration of shape and appearance cues to find proper multispectral correspondences, and to properly segment objects in low contrast regions. We also formulate our model as a frame processing pipeline using higher order terms to improve the temporal coherence of our results. Our method is evaluated under different configurations on multiple multispectral datasets, and our implementation is available online. Comment: Preprint accepted for publication in IJCV (December 2018).

    Extraplanar Dust in the Edge-On Spiral NGC 891

    We present high-resolution (<0.65") optical broad-band images of the edge-on Sb galaxy NGC 891 obtained with the WIYN 3.5-m telescope. These BVR images reveal a complex network of hundreds of dust absorbing structures far from the mid-plane of the galaxy. The dust structures have a wide range of morphologies and are clearly visible to |z|<1.5 kpc from the mid-plane. In this paper we discuss the general characteristics of the population of absorbing structures, as well as physical properties of 12 individual features. These 12 structures are characterised by N_H >10^21 cm^-2, with masses estimated to be more than 2x10^5 - 5x10^6 solar masses, assuming Galactic gas-to-dust relationships. The gravitational potential energies of the individual dust structures, given their observed heights and derived masses, lie in the range of 20-200x10^51 ergs. Rough number counts of extraplanar dust features suggest the mass of high-z gas associated with extraplanar dust in NGC 891 likely exceeds 2x10^8 solar masses, or ~2% of the total neutral ISM mass of the galaxy. We discuss several mechanisms which may produce high-z dust structures such as those seen in the images presented here. It is not yet known which of these mechanisms are primarily responsible for the extensive extraplanar dust structures seen in our images. The data presented are part of a larger program to search for and characterize off-plane dust structures in edge-on systems. (Abstract abridged.) Comment: To appear in the Astronomical Journal: 37 pages, Latex; 9 separate figures; the paper and high-resolution figures are also available at http://www.astro.wisc.edu/~howk/Papers/papers.htm

    Overcoming the Challenges Associated with Image-based Mapping of Small Bodies in Preparation for the OSIRIS-REx Mission to (101955) Bennu

    The OSIRIS-REx Asteroid Sample Return Mission is the third mission in NASA's New Frontiers Program and is the first U.S. mission to return samples from an asteroid to Earth. The most important decision ahead of the OSIRIS-REx team is the selection of a prime sample-site on the surface of asteroid (101955) Bennu. Mission success hinges on identifying a site that is safe and has regolith that can readily be ingested by the spacecraft's sampling mechanism. To inform this mission-critical decision, the surface of Bennu is mapped using the OSIRIS-REx Camera Suite and the images are used to develop several foundational data products. Acquiring the necessary inputs to these data products requires observational strategies that are defined specifically to overcome the challenges associated with mapping a small irregular body. We present these strategies in the context of assessing candidate sample-sites at Bennu according to a framework of decisions regarding the relative safety, sampleability, and scientific value across the asteroid's surface. To create data products that aid these assessments, we describe the best practices developed by the OSIRIS-REx team for image-based mapping of irregular small bodies. We emphasize the importance of using 3D shape models and the ability to work in body-fixed rectangular coordinates when dealing with planetary surfaces that cannot be uniquely addressed by body-fixed latitude and longitude. Comment: 31 pages, 10 figures, 2 tables

    Video Registration for Multimodal Surveillance Systems

    ABSTRACT Recently, the design and deployment of thermal-visible surveillance systems for human analysis has attracted considerable attention in the computer vision community. Thermal-visible imagery applications for human analysis span different domains, including medicine, in-vehicle safety systems, and surveillance. The motivation for such a system is to improve the quality of the data, with the ultimate goal of improving the performance of the targeted surveillance system. 
A fundamental issue associated with a thermal-visible imaging system is the accurate registration of corresponding features and information from images with large differences in imaging characteristics, where one captures color information (reflected energy) and the other captures the thermal signature (emitted energy). This problem is called image/video registration. Video surveillance is one of the most extensive application domains of multispectral imaging. Automatic video surveillance in a realistic environment, either indoor or outdoor, is difficult due to the large number of environmental factors such as illumination variations, wind, fog, and shadows. In a multimodal surveillance system, the joint use of different modalities increases the reliability of input data and reveals information about the scene that might be missed using a unimodal imaging system. The early multimodal video surveillance systems were designed mainly for military applications. Nowadays, because of the reduction in the price of thermal cameras, this subject of research is extending to civilian applications and has attracted more interest for a variety of human monitoring objectives. Image registration approaches for an automatic multimodal video surveillance system are divided into two general categories based on the range of the captured scene: approaches that are appropriate for long-range scenes, and approaches that are suitable for close-range scenes. In the literature, this subject of research is not well documented, especially for close-range surveillance application domains. Our research is focused on novel image registration solutions for both close-range and long-range scenes featuring multiple humans. The proposed solutions are presented in the four articles included in this thesis. Our registration methods serve as preprocessing for further video analysis such as tracking, human localization, behavioral pattern analysis, and object categorization. 
For long-range video surveillance, we propose an iterative system that consists of simultaneous thermal-visible video registration, sensor fusion, and people tracking. Our video registration is based on RANSAC object trajectory matching, which estimates an affine transformation matrix to globally transform foreground objects of one image onto the other. Our proposed multimodal surveillance system is based on a novel feedback scheme between registration and tracking modules that improves the performance of both modules iteratively over time. Our methods are designed for online applications and no camera calibration or special setup is required. For close-range video surveillance applications, we introduce Local Self-Similarity (LSS) as a viable similarity measure for matching corresponding human body regions of thermal and visible images. We also demonstrate theoretically and quantitatively that LSS, as a thermal-visible similarity measure, is more robust to differences between corresponding regions' textures than Mutual Information (MI), which is the classic multimodal similarity measure. Other viable local image descriptors including Histogram Of Gradient (HOG), Scale Invariant Feature Transform (SIFT), and Binary Robust Independent Elementary Feature (BRIEF) are also outperformed by LSS. Moreover, we propose an LSS-based dense local stereo correspondence algorithm based on a voting approach, which estimates a dense disparity map for each foreground region in the image. The resulting disparity map can then be used to align the reference image on the second image. We demonstrate that our proposed LSS-based local registration method outperforms similar state-of-the-art MI-based local registration methods in the literature. Our experiments were carried out using realistic human monitoring scenarios in a close-range scene. 
Due to the shortcomings of local stereo correspondence approaches for estimating accurate disparities in depth discontinuity regions, we propose a novel stereo correspondence method based on a global optimization approach. We introduce a stereo model appropriate for thermal-visible image registration using an energy minimization framework and Belief Propagation (BP) as a method to optimize the disparity assignment via an energy function. In this method, we integrate color and motion visual cues as soft constraints into an energy function to improve disparity assignment accuracy at depth discontinuities. Although global correspondence approaches are computationally more expensive than Winner Take All (WTA) local correspondence approaches, the efficient BP algorithm and the parallel programming (OpenMP) in C++ that we used in our implementation speed up processing significantly and make our methods viable for video surveillance applications. Our methods are implemented in C++ using the OpenCV library and object-oriented programming. Our methods are designed to be integrated easily as preprocessing for further video analysis. In other words, the input data of our methods could come from two synchronized online video streams, and for further analysis a new module could be added to our frame-by-frame algorithmic diagram. Further analysis might be object tracking, human localization, and trajectory pattern analysis for multimodal long-range monitoring applications, and behavior pattern analysis, object categorization, and tracking for close-range applications.
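As a point of reference for the local baseline this abstract contrasts with its global BP approach, the following is a minimal sketch of Winner Take All disparity selection: each pixel independently takes the disparity with the lowest matching cost, with no smoothness term linking neighbours. The cost values are made up for illustration, and the function name `wta_disparity` is hypothetical; in the thesis the costs would come from a similarity measure such as LSS.

```python
def wta_disparity(cost_volume):
    """For each pixel, pick the disparity index with the lowest matching cost.

    cost_volume[y][x][d] is the (hypothetical) cost of assigning disparity d
    to the pixel at row y, column x.
    """
    return [
        [min(range(len(costs)), key=costs.__getitem__) for costs in row]
        for row in cost_volume
    ]


# Toy 2x2 image with three candidate disparities per pixel.
cost_volume = [
    [[0.9, 0.2, 0.7], [0.4, 0.5, 0.1]],
    [[0.3, 0.8, 0.6], [0.6, 0.1, 0.9]],
]
disparity_map = wta_disparity(cost_volume)
```

Because every pixel is decided in isolation, this kind of selection tends to be unreliable at depth discontinuities, which is the shortcoming the global formulation is meant to address.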

    Leveraging Image Analysis to Compute 3D Plant Phenotypes Based on Voxel-Grid Plant Reconstruction

    High throughput image-based plant phenotyping facilitates the extraction of morphological and biophysical traits of a large number of plants non-invasively in a relatively short time. It facilitates the computation of advanced phenotypes by considering the plant as a single object (holistic phenotypes) or its components, i.e., leaves and the stem (component phenotypes). The architectural complexity of plants increases over time due to variations in self-occlusions and phyllotaxy, i.e., arrangements of leaves around the stem. One of the central challenges to computing phenotypes from 2-dimensional (2D) single view images of plants, especially at the advanced vegetative stage in the presence of self-occluding leaves, is that the information captured in 2D images is incomplete, and hence, the computed phenotypes are inaccurate. We introduce a novel algorithm to compute 3-dimensional (3D) plant phenotypes from multiview images using voxel-grid reconstruction of the plant (3DPhenoMV). The paper also presents a novel method to reliably detect and separate the individual leaves and the stem from the 3D voxel-grid of the plant using voxel overlapping consistency check and point cloud clustering techniques. To evaluate the performance of the proposed algorithm, we introduce the University of Nebraska-Lincoln 3D Plant Phenotyping Dataset (UNL-3DPPD). A generic taxonomy of 3D image-based plant phenotypes is also presented to promote 3D plant phenotyping research. A subset of these phenotypes is computed using computer vision algorithms, with discussion of their significance in the context of plant science. 
The central contributions of the paper are (a) an algorithm for 3D voxel-grid reconstruction of maize plants at the advanced vegetative stages using images from multiple 2D views; (b) a generic taxonomy of 3D image-based plant phenotypes and a public benchmark dataset, i.e., UNL-3DPPD, to promote the development of 3D image-based plant phenotyping research; and (c) novel voxel overlapping consistency check and point cloud clustering techniques to detect and isolate individual leaves and stem of the maize plants to compute the component phenotypes. Detailed experimental analyses demonstrate the efficacy of the proposed method, and also show the potential of 3D phenotypes to explain the morphological characteristics of plants regulated by genetic and environmental interactions.
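One common way to build a voxel grid from multiple 2D views is silhouette-based carving: a voxel is kept only if its projection falls inside the object's silhouette in every view. The sketch below illustrates that general idea only. It assumes three axis-aligned orthographic views rather than calibrated cameras, the name `carve` is hypothetical, and it may differ from the reconstruction actually used in 3DPhenoMV.

```python
def carve(size, silhouettes):
    """Keep voxel (x, y, z) only if every view's silhouette contains its projection.

    Assumption: three orthographic views along the grid axes.
    front holds (x, z) pairs, side holds (y, z) pairs, top holds (x, y) pairs.
    """
    front, side, top = silhouettes
    return {
        (x, y, z)
        for x in range(size)
        for y in range(size)
        for z in range(size)
        if (x, z) in front and (y, z) in side and (x, y) in top
    }


# Toy "stem": a vertical column at x=1, y=1 inside a 3x3x3 grid.
size = 3
front = {(1, z) for z in range(size)}
side = {(1, z) for z in range(size)}
top = {(1, 1)}
voxels = carve(size, (front, side, top))
```

Component phenotypes such as stem height could then be read off the occupied voxels, for example from the extent of the set along z.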

    Quantifying Effusion Rates at Active Volcanoes through Integrated Time-Lapse Laser Scanning and Photography

    During volcanic eruptions, measurements of the rate at which magma is erupted underpin hazard assessments. For eruptions dominated by the effusion of lava, estimates are often made using satellite data; here, in a case study at Mount Etna (Sicily), we make the first measurements based on terrestrial laser scanning (TLS), and we also include explosive products. During the study period (17–21 July, 2012), regular strombolian explosions were occurring within the Bocca Nuova crater, producing a ~50 m high scoria cone and a small lava flow field. TLS surveys over multi-day intervals determined a mean cone growth rate (effusive and explosive products) of ~0.24 m^3 s^-1. Differences between 0.3-m-resolution DEMs acquired at 10-minute intervals captured the evolution of a breakout lava flow lobe advancing at 0.01–0.03 m^3 s^-1. Partial occlusion within the crater prevented similar measurement of the main flow, but integrating TLS data with time-lapse imagery enabled lava viscosity (7.4 × 10^5 Pa s) to be derived from surface velocities and, hence, a flux of 0.11 m^3 s^-1 to be calculated. The total dense-rock equivalent magma discharge estimates range from ~0.1 to ~0.2 m^3 s^-1 over the measurement period, and suggest that simultaneous estimates from satellite data are somewhat overestimated. Our results support the use of integrated TLS and time-lapse photography for ground-truthing space-based measurements and highlight the value of interactive image analysis when automated approaches such as particle image velocimetry (PIV) fail.
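The DEM-differencing measurement mentioned in this abstract reduces to simple arithmetic: sum the per-cell height change, multiply by the cell area, and divide by the time between surveys. The sketch below uses the 0.3 m cell size and 10-minute interval quoted in the abstract, but the elevation values are invented for illustration and the function names are hypothetical. It yields a bulk volume rate; converting to a dense-rock equivalent would need a vesicularity correction that the abstract does not specify.

```python
def volume_change(dem_before, dem_after, cell_size_m):
    """Bulk volume change in m^3 between two co-registered DEM grids (heights in m)."""
    cell_area = cell_size_m ** 2
    return sum(
        (after - before) * cell_area
        for row_before, row_after in zip(dem_before, dem_after)
        for before, after in zip(row_before, row_after)
    )


def mean_rate(dem_before, dem_after, cell_size_m, interval_s):
    """Mean bulk volume rate in m^3 s^-1 over the interval between the two surveys."""
    return volume_change(dem_before, dem_after, cell_size_m) / interval_s


# Invented 2x2 elevation grids (metres), surveyed 10 minutes apart at 0.3 m resolution.
dem_t0 = [[10.0, 10.0], [10.0, 10.0]]
dem_t1 = [[10.5, 10.0], [11.0, 10.5]]
rate = mean_rate(dem_t0, dem_t1, cell_size_m=0.3, interval_s=600)
```

Here the total height gain of 2.0 m over cells of 0.09 m^2 gives about 0.18 m^3, or roughly 3 × 10^-4 m^3 s^-1 over 600 s.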