Automatic Image Registration in Infrared-Visible Videos using Polygon Vertices
In this paper, an automatic method is proposed to perform image registration
between visible and infrared pairs of video sequences containing multiple targets. In
multimodal image analysis, such as image fusion systems, color and IR sensors are
placed close to each other and capture the same scene simultaneously, but the
videos are not properly aligned by default because of differing fields of view,
image capture parameters, working principles and other camera specifications.
Because the scenes are usually not planar, alignment needs to be performed
continuously by extracting relevant common information. In this paper, we
approximate the shape of the targets by polygons and use an affine transformation
to align the two video sequences. After background subtraction, keypoints
on the contours of the foreground blobs are detected using the DCE (Discrete Curve
Evolution) technique. These keypoints are then described by the local shape at
each vertex of the obtained polygon. The keypoints are matched based on the
convexity of the polygon's vertices and the Euclidean distance between them. Only good
matches for each local shape polygon in a frame are kept. To achieve a global
affine transformation that maximizes the overlap of infrared and visible
foreground pixels, the matched keypoints of each local shape polygon are stored
temporally in a buffer over several frames. The transformation matrix is evaluated at
each frame using the temporal buffer, and the best matrix is selected based on
an overlap-ratio criterion. Our experimental results demonstrate that this
method can provide highly accurate registered images and that it outperforms a
previous related method.
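The selection loop described above — fit an affine transform to the buffered keypoint matches and keep the matrix with the best foreground overlap ratio — can be sketched roughly as follows. This is a simplified illustration, not the authors' implementation; the least-squares affine fit, nearest-neighbour warp and intersection-over-union overlap ratio are standard stand-ins.

```python
import numpy as np

def fit_affine(src, dst):
    """Least-squares 2D affine transform mapping src -> dst (Nx2 arrays)."""
    A = np.hstack([src, np.ones((len(src), 1))])   # rows [x, y, 1]
    # Solve A @ M.T ~= dst for the 2x3 affine matrix M.
    M, *_ = np.linalg.lstsq(A, dst, rcond=None)
    return M.T                                     # shape (2, 3)

def warp_mask(mask, M):
    """Nearest-neighbour warp of a binary mask by affine M (2x3)."""
    h, w = mask.shape
    ys, xs = np.nonzero(mask)
    pts = np.stack([xs, ys, np.ones_like(xs)])
    wx, wy = np.round(M @ pts).astype(int)
    out = np.zeros_like(mask)
    keep = (wx >= 0) & (wx < w) & (wy >= 0) & (wy < h)
    out[wy[keep], wx[keep]] = 1
    return out

def overlap_ratio(a, b):
    """Intersection-over-union of two binary foreground masks."""
    union = np.logical_or(a, b).sum()
    return np.logical_and(a, b).sum() / union if union else 0.0

def best_affine(buffered_matches, ir_mask, vis_mask):
    """Evaluate one candidate matrix per buffered match set, keep the best."""
    best, best_score = None, -1.0
    for src, dst in buffered_matches:
        M = fit_affine(src, dst)
        score = overlap_ratio(warp_mask(ir_mask, M), vis_mask)
        if score > best_score:
            best, best_score = M, score
    return best, best_score
```

In practice the buffer would hold matches from the last few frames, and a new matrix would only replace the current one when its overlap ratio is higher.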
3D photogrammetric data modeling and optimization for multipurpose analysis and representation of Cultural Heritage assets
This research addresses the issues of processing, managing and representing,
for further dissemination, the large amount of 3D data that can be acquired and stored
today with modern geomatic techniques of 3D metric survey. In particular, this thesis focuses
on the optimization process applied to 3D photogrammetric data of Cultural Heritage
assets.
Modern geomatic techniques enable the acquisition and storage of large amounts of data,
with high metric and radiometric accuracy and precision, also in the very close range
field, and the processing of very detailed 3D textured models. Nowadays, the photogrammetric
pipeline has well-established potential and is considered one of the principal
techniques for producing detailed 3D textured models at low cost.
The potential offered by high-resolution textured 3D models is now well known,
and such representations are a powerful tool for many multidisciplinary purposes, at
different scales and resolutions, from documentation, conservation and restoration to
visualization and education. For example, their sub-millimetric precision makes them
suitable for scientific studies of geometry and materials (e.g., for structural and
static tests, for planning restoration activities, or for historical sources); their high fidelity
to the real object and their navigability make them optimal for web-based visualization
and dissemination applications. Thanks to improvements in new visualization
standards, they can easily be used as visualization interfaces linking different kinds of
information in a highly intuitive way. Furthermore, many museums today look for more
interactive exhibitions that may heighten visitors' emotions, and many recent
applications make use of 3D content (e.g., in virtual or augmented reality applications and
through virtual museums).
What all of these applications have to deal with is the difficulty of managing the
large amount of data that has to be represented and navigated.
Indeed, reality-based models have very large file sizes (up to tens of GB), which makes them
difficult to handle on common and portable devices, to publish on the internet, or to
manage in real-time applications. Even though recent advances produce ever more
sophisticated and capable hardware and internet standards, empowering the ability to
easily handle, visualize and share such content, other research aims at defining a common
pipeline for the generation and optimization of 3D models with a reduced number of
polygons that are nevertheless able to satisfy detailed radiometric and geometric requirements.
This thesis is set in this scenario and focuses on the 3D modeling process of
photogrammetric data aimed at easy sharing and visualization. In particular, this
research tested a 3D model optimization process that aims at generating Low
Poly models, with very small file sizes, starting from the data of High
Poly ones, while nevertheless offering a level of detail comparable to the original models. To
do this, several tools borrowed from the game industry and game engines have been used.
For this test, three case studies were chosen: a modern sculpture by a contemporary
Italian artist; a Roman marble statue preserved in the Civic Archaeological Museum of
Torino; and the frieze of the Augustus arch preserved in the city of Susa (Piedmont,
Italy). All the test cases were surveyed by means of close-range photogrammetric
acquisition, and three highly detailed 3D models were generated through a
Structure from Motion and image matching pipeline. On the final High Poly models,
different optimization and decimation tools were tested with the aim of
evaluating the quality of the information that can be extracted from the final optimized
models in comparison to the original High Poly ones. This study showed how
tools borrowed from Computer Graphics offer great potential in the Cultural
Heritage field as well. This approach may meet the needs of multipurpose and
multiscale studies, using different levels of optimization, and the procedure could be
applied to different kinds of objects, with a variety of sizes and shapes, also on
multiscale and multisensor data, such as buildings, architectural complexes, data from
UAV surveys and so on.
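Decimation of the kind tested here can be illustrated with a minimal vertex-clustering scheme: vertices falling in the same grid cell are collapsed into their centroid, and faces that degenerate are dropped. This is a toy sketch for intuition only, not the game-industry tools actually used in the thesis.

```python
import numpy as np

def vertex_cluster_decimate(vertices, faces, cell_size):
    """Collapse all vertices in the same grid cell into their centroid.

    vertices: (N, 3) float array; faces: (M, 3) int array of vertex indices.
    Returns the decimated (vertices, faces); degenerate faces are removed.
    """
    # Quantize each vertex to an integer grid cell.
    cells = np.floor(vertices / cell_size).astype(int)
    _, cluster_id, counts = np.unique(
        cells, axis=0, return_inverse=True, return_counts=True)
    # New vertex position = centroid of its cluster.
    new_verts = np.zeros((len(counts), 3))
    np.add.at(new_verts, cluster_id, vertices)
    new_verts /= counts[:, None]
    # Remap face indices and drop faces collapsed to a line or point.
    new_faces = cluster_id[faces]
    ok = ((new_faces[:, 0] != new_faces[:, 1]) &
          (new_faces[:, 1] != new_faces[:, 2]) &
          (new_faces[:, 0] != new_faces[:, 2]))
    return new_verts, new_faces[ok]
```

Larger `cell_size` gives a lower polygon count; production tools (quadric edge collapse, normal/texture baking) preserve detail far better than this uniform scheme.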
3D head motion, point-of-regard and encoded gaze fixations in real scenes: next-generation portable video-based monocular eye tracking
Portable eye trackers allow us to see where a subject is looking when performing a natural task with free head and body movements. These eye trackers include headgear containing a camera directed at one of the subject's eyes (the eye camera) and another camera (the scene camera) positioned above the same eye directed along the subject's line-of-sight. The output video includes the scene video with a crosshair depicting where the subject is looking -- the point-of-regard (POR) -- that is updated for each frame. This video may be the desired final result or it may be further analyzed to obtain more specific information about the subject's visual strategies. A list of the calculated POR positions in the scene video can also be analyzed. The goals of this project are to expand the information that we can obtain from a portable video-based monocular eye tracker and to minimize the amount of user interaction required to obtain and analyze this information. This work includes offline processing of both the eye and scene videos to obtain robust 2D PORs in scene video frames, identify gaze fixations from these PORs, obtain 3D head motion and ray trace fixations through volumes-of-interest (VOIs) to determine what is being fixated, when and where (the 3D POR). To avoid the redundancy of ray tracing a 2D POR in every video frame and to group these POR data meaningfully, a fixation-identification algorithm is employed to simplify the long list of 2D POR data into gaze fixations. In order to ray trace these fixations, the 3D motion -- position and orientation over time -- of the scene camera is computed. This camera motion is determined via an iterative structure and motion recovery algorithm that requires a calibrated camera and knowledge of the 3D location of at least four points in the scene (which can be selected from premeasured VOI vertices). The subject's 3D head motion is obtained directly from this camera motion.
For the final stage of the algorithm, the 3D locations and dimensions of VOIs in the scene are required. This VOI information in world coordinates is converted to camera coordinates for ray tracing. A representative 2D POR position for each fixation is converted from image coordinates to the same camera coordinate system. Then, a ray is traced from the camera center through this position to determine which (if any) VOI is being fixated and where it is being fixated -- the 3D POR in the world. Results are presented for various real scenes. Novel visualizations of portable eye tracker data created using the results of our algorithm are also presented.
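Grouping the frame-by-frame 2D PORs into gaze fixations, as described above, is commonly done with a dispersion-threshold (I-DT) scheme; a minimal sketch follows. This is an illustrative stand-in for the fixation-identification step, not necessarily the exact algorithm used in this work.

```python
import numpy as np

def idt_fixations(por, max_dispersion, min_samples):
    """Dispersion-threshold (I-DT) fixation identification.

    por: (N, 2) array of 2D point-of-regard positions, one row per frame.
    Returns a list of (start, end, centroid) tuples; `end` is exclusive.
    """
    fixations = []
    i, n = 0, len(por)
    while i + min_samples <= n:
        j = i + min_samples
        # Dispersion = (max x - min x) + (max y - min y) over the window.
        disp = (por[i:j].max(0) - por[i:j].min(0)).sum()
        if disp <= max_dispersion:
            # Grow the window while dispersion stays under the threshold.
            while j < n:
                w = por[i:j + 1]
                if (w.max(0) - w.min(0)).sum() > max_dispersion:
                    break
                j += 1
            fixations.append((i, j, por[i:j].mean(0)))
            i = j
        else:
            i += 1
    return fixations
```

Each returned fixation then needs only a single representative POR to ray trace, instead of one ray per frame.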
Garment texturing using Kinect V2.0
This thesis describes three new garment retexturing methods for FitsMe virtual fitting room applications,
using data from the Microsoft Kinect II RGB-D camera.
The first method introduced is an automatic technique for garment retexturing using
a single RGB-D image and infrared information obtained from the Kinect II. First, the garment
is segmented out from the image using GrabCut or depth segmentation. Then, texture domain
coordinates are computed for each pixel belonging to the garment using normalized 3D information.
Afterwards, shading is applied to the new colors from the texture image.
The second method proposed in this work addresses 2D-to-3D garment retexturing, where a segmented
garment of a mannequin or person is matched to a new source garment and retextured,
resulting in augmented images in which the new source garment is transferred to the mannequin
or person. The problem is divided into garment boundary matching, based on point set registration
using Gaussian mixture models, and interpolation of inner points using surface
topology extracted through geodesic paths, which leads to a more realistic result than standard
approaches.
The final contribution of this thesis is another novel method for
increasing the texture quality of a 3D model of a garment, using the same Kinect frame
sequence that was used in the model creation. First, a structured mesh must be created
from the 3D model, so the 3D model is wrapped to a base model with defined seams and a
texture map. Afterwards, frames are matched to the newly created model and, through ray
casting, the color values of the Kinect frames are mapped to the UV map of the 3D model.
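The texture-coordinate step of the first method — mapping each garment pixel's normalized 3D position into the texture domain and then shading the new colors — can be sketched as below. The planar normalization and Lambertian-style shading are assumptions made for illustration; the thesis computes its coordinates from the full normalized 3D information.

```python
import numpy as np

def retexture(points, normals, texture, light_dir=(0.0, 0.0, 1.0)):
    """Assign texture colors to garment points via normalized coordinates.

    points:  (N, 3) 3D positions of garment pixels (from the depth camera).
    normals: (N, 3) unit surface normals used for shading.
    texture: (H, W, 3) texture image.
    Returns (N, 3) shaded colors.
    """
    h, w, _ = texture.shape
    # Normalize the garment's x/y extent to [0, 1] texture coordinates.
    lo, hi = points.min(0), points.max(0)
    uv = (points[:, :2] - lo[:2]) / np.maximum(hi[:2] - lo[:2], 1e-9)
    cols = np.clip((uv[:, 0] * (w - 1)).astype(int), 0, w - 1)
    rows = np.clip((uv[:, 1] * (h - 1)).astype(int), 0, h - 1)
    colors = texture[rows, cols].astype(float)
    # Simple diffuse shading so the new texture keeps the garment's folds.
    shade = np.clip(normals @ np.asarray(light_dir), 0.0, 1.0)
    return colors * shade[:, None]
```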
Computationally efficient deformable 3D object tracking with a monocular RGB camera
Monocular RGB cameras are present in most scopes and devices, including embedded environments like robots, cars and home automation. Most of these environments have in common a significant presence of human operators with whom the system has to interact. This context provides the motivation to use the captured monocular images to improve the understanding of the operator and the surrounding scene for more accurate results and applications. However, monocular images do not have depth information, which is a crucial element in understanding the 3D scene correctly. Estimating the three-dimensional information of an object in the scene using a single two-dimensional image is already a challenge. The challenge grows if the object is deformable (e.g., a human body or a human face) and there is a need to track its movements and interactions in the scene. Several methods attempt to solve this task, including modern regression methods based on Deep Neural Networks. However, despite the great results, most are computationally demanding and therefore unsuitable for several environments. Computational efficiency is a critical feature for computationally constrained setups like embedded or onboard systems present in robotics and automotive applications, among others. This study proposes computationally efficient methodologies to reconstruct and track three-dimensional deformable objects, such as human faces and human bodies, using a single monocular RGB camera. To model the deformability of faces and bodies, it considers two types of deformations: non-rigid deformations for face tracking, and rigid multi-body deformations for body pose tracking. Furthermore, it studies their performance on computationally restricted devices like smartphones and onboard systems used in the automotive industry.
The information extracted from such devices gives valuable insight into human behaviour, a crucial element in improving human-machine interaction. We tested the proposed approaches in different challenging application fields, like onboard driver monitoring systems, human behaviour analysis from monocular videos, and human face tracking on embedded devices.
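Rigid multi-body pose tracking of the kind described ultimately reduces to estimating rigid transforms between corresponding 3D point sets for each body part; a standard Kabsch/Procrustes solver is sketched below. This is an illustrative building block under that assumption, not the thesis method itself.

```python
import numpy as np

def kabsch(src, dst):
    """Best-fit rotation R and translation t with R @ p + t ~= q.

    src, dst: (N, 3) corresponding 3D points. Returns (R, t).
    """
    src_c, dst_c = src.mean(0), dst.mean(0)
    # Cross-covariance of the centered point sets.
    H = (src - src_c).T @ (dst - dst_c)
    U, _, Vt = np.linalg.svd(H)
    # Correct for a possible reflection to keep a proper rotation.
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = dst_c - R @ src_c
    return R, t
```

Being a closed-form SVD of a 3x3 matrix, this per-part solve is cheap enough for the computationally constrained setups the thesis targets.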
Methods for Real-time Visualization and Interaction with Landforms
This thesis presents methods to enrich data modeling and analysis in the geoscience domain with a particular focus on geomorphological applications. First, a short overview of the relevant characteristics of the used remote sensing data and basics of its processing and visualization are provided. Then, two new methods for the visualization of vector-based maps on digital elevation models (DEMs) are presented. The first method uses a texture-based approach that generates a texture from the input maps at runtime taking into account the current viewpoint. In contrast to that, the second method utilizes the stencil buffer to create a mask in image space that is then used to render the map on top of the DEM. A particular challenge in this context is posed by the view-dependent level-of-detail representation of the terrain geometry. After suitable visualization methods for vector-based maps have been investigated, two landform mapping tools for the interactive generation of such maps are presented. The user can carry out the mapping directly on the textured digital elevation model and thus benefit from the 3D visualization of the relief. Additionally, semi-automatic image segmentation techniques are applied in order to reduce the amount of user interaction required and thus make the mapping process more efficient and convenient. The challenge in the adaptation of the methods lies in the transfer of the algorithms to the quadtree representation of the data and in the application of out-of-core and hierarchical methods to ensure interactive performance. Although high-resolution remote sensing data are often available today, their effective resolution at steep slopes is rather low due to the oblique acquisition angle. For this reason, remote sensing data are suitable to only a limited extent for visualization as well as landform mapping purposes.
To provide an easy way to supply additional imagery, an algorithm for registering uncalibrated photos to a textured digital elevation model is presented. A particular challenge in registering the images is posed by large variations in the photos concerning resolution, lighting conditions, seasonal changes, etc. The registered photos can be used to increase the visual quality of the textured DEM, in particular at steep slopes. To this end, a method is presented that combines several georegistered photos into textures for the DEM. The difficulty in this compositing process is to create a consistent appearance and avoid visible seams between the photos. In addition, the photos also provide valuable means to improve landform mapping. To this end, an extension of the landform mapping methods is presented that allows the utilization of the registered photos during mapping. This way, a detailed and exact mapping becomes feasible even at steep slopes.
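Seam avoidance in compositing steps like the one above is often handled by distance-based feathering: each photo is weighted by its distance to the border of its own valid region, so contributions fade out smoothly where photos meet. The sketch below is an illustrative stand-in under that assumption, not the compositing method of the thesis.

```python
import numpy as np

def border_distance(mask):
    """Approximate L1 distance of each True pixel to the mask border."""
    dist = np.zeros(mask.shape, dtype=float)
    cur = mask.copy()
    while cur.any():
        dist += cur
        # Erode: keep pixels whose 4-neighbours are all inside the mask.
        p = np.pad(cur, 1, constant_values=False)
        cur = (p[1:-1, 1:-1] & p[:-2, 1:-1] & p[2:, 1:-1]
               & p[1:-1, :-2] & p[1:-1, 2:])
    return dist

def composite(photos, masks):
    """Feathered blend of georegistered photos into one texture.

    photos: list of (H, W) float images; masks: list of (H, W) bool arrays
    marking where each photo has valid data.
    """
    weights = [border_distance(m) for m in masks]
    total = np.sum(weights, axis=0)
    out = np.zeros(photos[0].shape, dtype=float)
    for img, w in zip(photos, weights):
        out += img * w
    # Normalize; pixels covered by no photo keep zero.
    return np.divide(out, total, out=out, where=total > 0)
```

Near a photo's border its weight approaches zero, so the transition to the neighbouring photo is gradual rather than a hard seam.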
Mutual segmentation of objects of interest in multispectral stereo image sequences
The automated video surveillance systems currently deployed around the world are still quite far, in terms of capabilities, from the ones that have inspired countless science fiction works over the past few years. One of the reasons behind this lag in development is the lack of low-level tools that allow raw image data to be processed directly in the field.
This preprocessing is used to reduce the amount of information transferred to centralized servers that have to interpret the captured visual content for further use. The identification of objects of interest in raw images based on motion is an example of a preprocessing step that might be required by a large system. However, in a surveillance context, the preprocessing method can seldom rely on an appearance or shape model to recognize these objects, since their exact nature cannot be known in advance. This complicates the elaboration of low-level image processing methods.
In this thesis, we present different methods that detect and segment objects of interest from video sequences in a fully unsupervised fashion. We first explore monocular video segmentation approaches based on background subtraction. These approaches are based on the idea that the background of an observed scene can be modeled over time, and that any drastic variation in appearance that is not predicted by the model actually reveals the presence of an intruding object. The main challenge that must be met by background subtraction methods is that their model should be able to adapt to dynamic changes in scene conditions. The designed methods must also remain sensitive to the emergence of new objects of interest despite this increased robustness to predictable dynamic scene behaviors. We propose two methods that introduce different modeling techniques to improve background appearance description in an illumination-invariant way, and that analyze local background persistence to improve the detection of temporarily stationary objects. We also introduce new feedback mechanisms used to adjust the hyperparameters of our methods based on the observed dynamics of the scene and the quality of the generated output.
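The background-subtraction idea with a feedback-adjusted hyperparameter can be sketched with a minimal running-average model. This is a textbook baseline for illustration; the thesis methods use far richer appearance models and feedback loops.

```python
import numpy as np

class BackgroundSubtractor:
    """Running-average background model with simple threshold feedback.

    Pixels differing from the model by more than `threshold` are foreground;
    `alpha` controls how fast the model adapts to gradual scene changes.
    """

    def __init__(self, first_frame, alpha=0.05, threshold=30.0):
        self.model = first_frame.astype(float)
        self.alpha = alpha
        self.threshold = threshold

    def apply(self, frame):
        frame = frame.astype(float)
        fg = np.abs(frame - self.model) > self.threshold
        # Update the model only where the scene looks like background,
        # so temporarily stationary objects are not absorbed immediately.
        self.model = np.where(
            fg, self.model,
            (1 - self.alpha) * self.model + self.alpha * frame)
        # Feedback: if almost everything is flagged, the threshold is
        # likely too tight for the scene's dynamics; relax it slightly.
        if fg.mean() > 0.5:
            self.threshold *= 1.1
        return fg
```

A real system would also feed detection quality back into `alpha`, as the thesis does for its hyperparameters.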
Use of ERTS-1 data: Summary report of work on ten tasks
The author has identified the following significant results. Depth mappings for a portion of Lake Michigan and at the Little Bahama Bank test site have been verified by use of navigation charts and on-site visits. A thirteen-category recognition map of Yellowstone Park has been prepared. Model calculations of atmospheric effects for various altitudes have been prepared. Radar, SLAR, and ERTS-1 data for flooded areas of Monroe County, Michigan are being studied. Water bodies can be reliably recognized and mapped using maximum likelihood processing of ERTS-1 digital data. Wetland mapping has been accomplished by slicing of a single band and/or ratio processing of two bands for a single observation date. Both analog and digital processing have been used to map the Lake Ontario basin using ERTS-1 data. Operating characteristic curves were developed for the proportion estimation algorithm to determine its performance in the measurement of surface water area. The signal in band MSS-5 was related to the sediment content of waters by a modelling approach and by relating surface measurements of water to processed ERTS data. Radiance anomalies in ERTS-1 data could be associated with the presence of oil on water in San Francisco Bay, but the anomalies were of the same order as those caused by variations in sediment concentration and tidal flushing.
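Maximum likelihood processing of the kind used above for water-body mapping assigns each pixel to the class whose multivariate Gaussian (fitted to training pixels) gives it the highest likelihood. The sketch below illustrates the classifier on hypothetical band statistics, not on actual ERTS-1 data.

```python
import numpy as np

def train(classes):
    """Per-class mean and covariance from training pixels.

    classes: dict name -> (N, B) array of training pixels (B spectral bands).
    """
    return {name: (pix.mean(0), np.cov(pix, rowvar=False))
            for name, pix in classes.items()}

def log_likelihood(x, mean, cov):
    """Log of the multivariate Gaussian density at x (constants dropped)."""
    d = x - mean
    return -0.5 * (np.log(np.linalg.det(cov)) + d @ np.linalg.solve(cov, d))

def classify(pixels, stats):
    """Maximum likelihood class label for each pixel (rows of `pixels`)."""
    names = list(stats)
    scores = np.array([[log_likelihood(p, *stats[n]) for n in names]
                       for p in pixels])
    return [names[i] for i in scores.argmax(1)]
```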