100 research outputs found

    Surface Modeling and Analysis Using Range Images: Smoothing, Registration, Integration, and Segmentation

    This dissertation presents a framework for 3D reconstruction and scene analysis, using a set of range images. The motivation for developing this framework came from the need to reconstruct the surfaces of small mechanical parts in reverse engineering tasks, build a virtual environment of indoor and outdoor scenes, and understand 3D images. The input of the framework is a set of range images of an object or a scene captured by range scanners. The output is a triangulated surface that can be segmented into meaningful parts. A textured surface can be reconstructed if color images are provided. The framework consists of surface smoothing, registration, integration, and segmentation. Surface smoothing eliminates the noise present in raw measurements from range scanners. This research proposes an area-decreasing flow that is theoretically identical to the mean curvature flow. Using area-decreasing flow, there is no need to estimate the curvature value, and an optimal step size of the flow can be obtained. Crease edges and sharp corners are preserved by an adaptive scheme. Surface registration aligns measurements from different viewpoints in a common coordinate system. This research proposes a new surface representation scheme named the point fingerprint. Surfaces are registered by finding corresponding point pairs in an overlapping region based on fingerprint comparison. Surface integration merges registered surface patches into a whole surface. This research employs an implicit surface-based integration technique. The proposed algorithm can generate watertight models by space carving or by filling the holes based on volumetric interpolation. Textures from different views are integrated inside a volumetric grid. Surface segmentation is useful for decomposing CAD models in reverse engineering tasks and for aiding object recognition in a 3D scene. This research proposes a watershed-based surface mesh segmentation approach. 
The new algorithm accurately segments the plateaus by geodesic erosion using the fast marching method. The performance of the framework is presented using both synthetic and real-world data from different range scanners. The dissertation concludes by summarizing the development of the framework and then suggests future research topics.

    Construction of super-resolution mosaics from low-resolution video. Application to video summarization and transmission error concealment.

    The digitization of existing videos, together with the explosive growth of networked multimedia services such as digital television broadcasting and mobile communications, has produced an enormous quantity of compressed video. This calls for efficient indexing and navigation tools, but indexing before encoding is not common practice. The usual approach is to decode these videos completely and then build the indexes. This is very costly and therefore not feasible in real time. Moreover, important information such as motion, which is lost during decoding, is re-estimated even though it is already present in the compressed stream. Our goal in this thesis is therefore to reuse the data already present in the MPEG compressed stream for indexing and fast navigation. More precisely, we extract DC coefficients and motion vectors. In this thesis, we are particularly interested in constructing mosaics from the DC images extracted from I-frames. A mosaic is constructed by registering and merging all the images of a video sequence into a single coordinate system. This coordinate system is generally aligned with one of the images of the sequence: the reference image. The result is a single image that gives a global view of the sequence. We thus propose a complete system for constructing mosaics from the MPEG-1/2 stream that takes into account various problems arising in real video sequences, such as moving objects or illumination changes. An essential task in constructing a mosaic is estimating the motion between each image of the sequence and the reference image. Our method is based on a robust estimation of the global camera motion from the motion vectors of P-frames. 
However, the global camera motion estimated for a P-frame may be incorrect, because it depends strongly on the accuracy of the encoded vectors. We detect the affected P-frames by taking into account the DC coefficients of the associated encoded error, and we propose two methods for correcting these motions. A mosaic built from DC images has very low resolution and suffers from aliasing effects due to the nature of DC images. To increase its resolution and improve its visual quality, we apply a super-resolution method based on iterative back-projection. Super-resolution methods are also based on registering and merging the images of a video sequence, but are accompanied by image restoration. In this context, we developed a new method for estimating the blur due to camera motion, together with a corresponding spectral restoration method. Spectral restoration handles blur globally, but for objects whose motion is independent of the camera motion, local blurs appear. We therefore propose a new super-resolution algorithm, derived from the iterative spatial restoration of Van Cittert and Jansson, that can restore local blurs. Based on a segmentation of moving objects, we restore the background mosaic and the foreground objects separately. We adapted our blur estimation method accordingly. We first applied our method to video summarization, with the aim of fast mosaic-based navigation in compressed video. We then establish how reusing the intermediate results serves other indexing tasks, notably shot-change detection for I-frames and characterization of the camera motion. 
Finally, we explored the field of transmission error recovery. Our approach is to build a mosaic while decoding a shot; if data is lost, the missing information can be concealed using this mosaic.
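
For readers unfamiliar with the Van Cittert iteration mentioned in this abstract, the following is a minimal 1D sketch of the classical scheme, f_{k+1} = f_k + beta * (g - h * f_k), where g is the blurred observation and h the blur kernel. It is not the thesis's super-resolution algorithm: the function name, the toy signal, the kernel, and the relaxation factor are all assumptions chosen for illustration, and the Jansson variant (a signal-dependent relaxation) is omitted.

```python
import numpy as np

def van_cittert(observed, kernel, iterations=100, relaxation=1.0):
    """Classical Van Cittert iteration in 1D (illustrative sketch only)."""
    estimate = observed.astype(float)
    for _ in range(iterations):
        # Re-blur the current estimate and add back the residual.
        reblurred = np.convolve(estimate, kernel, mode="same")
        estimate = estimate + relaxation * (observed - reblurred)
    return estimate

# Hypothetical toy data: a box signal blurred by a small symmetric kernel.
signal = np.zeros(64)
signal[20:30] = 1.0
kernel = np.array([0.25, 0.5, 0.25])
blurred = np.convolve(signal, kernel, mode="same")
restored = van_cittert(blurred, kernel)

error_before = np.linalg.norm(blurred - signal)
error_after = np.linalg.norm(restored - signal)
```

With this noise-free toy kernel the restored signal is closer to the original than the blurred one. On noisy data the iteration amplifies noise and is usually stopped early.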

    Inverse Problems and Self-similarity in Imaging

    This thesis examines the concept of image self-similarity and provides solutions to various associated inverse problems such as resolution enhancement and missing fractal codes. In general, many real-world inverse problems are ill-posed, mainly because they lack a unique solution. The procedure of providing acceptable unique solutions to such problems is known as regularization. The concept of image prior, which has been of crucial importance in image modelling and processing, has also been important in solving inverse problems since it algebraically translates to the regularization procedure. Indeed, much recent progress in imaging has been due to advances in the formulation and practice of regularization. This, coupled with progress in optimization and numerical analysis, has yielded much improvement in computational methods of solving inverse imaging problems. Historically, the idea of self-similarity was important in the development of fractal image coding. Here we show that the self-similarity properties of natural images may be used to construct image priors for the purpose of addressing certain inverse problems. Indeed, new trends in the area of non-local image processing have provided a rejuvenated appreciation of image self-similarity and opportunities to explore novel self-similarity-based priors. We first revisit the concept of fractal-based methods and address some open theoretical problems in the area. This includes formulating a necessary and sufficient condition for the contractivity of the block fractal transform operator. We shall also provide some more generalized formulations of fractal-based self-similarity constraints of an image. These formulations can be developed algebraically and also in terms of the set-based method of Projection Onto Convex Sets (POCS). We then revisit the traditional inverse problems of single frame image zooming and multi-frame resolution enhancement, also known as super-resolution. 
Some ideas will be borrowed from newly developed non-local denoising algorithms in order to formulate self-similarity priors. Understanding the role of scale and choice of examples/samples is also important in these proposed models. For this purpose, we perform an extensive series of numerical experiments and analyze the results. These ideas naturally lead to the method of self-examples, which relies on the regularity properties of natural images at different scales, as a means of solving the single-frame image zooming problem. Furthermore, we propose and investigate a multi-frame super-resolution counterpart which does not require explicit motion estimation among video sequences.
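
As a rough illustration of the POCS idea named in this abstract, the sketch below alternates projections onto two convex sets to fill in missing samples of a 1D signal. The two sets used here (band-limited signals, and signals that agree with the known samples) are a swapped-in textbook choice, not the fractal self-similarity constraints of the thesis, and the function name, bandwidth, and test signal are assumptions.

```python
import numpy as np

def pocs_interpolate(samples, known_mask, bandwidth, iterations=200):
    """Alternate projections onto two convex sets (illustrative sketch only)."""
    estimate = np.where(known_mask, samples, 0.0)
    # Integer frequency index of each DFT bin, used to build the low-pass mask.
    freq_index = np.fft.fftfreq(samples.size) * samples.size
    in_band = np.abs(freq_index) <= bandwidth
    for _ in range(iterations):
        # Projection 1: onto band-limited signals (zero out-of-band bins).
        spectrum = np.fft.fft(estimate)
        spectrum[~in_band] = 0.0
        estimate = np.fft.ifft(spectrum).real
        # Projection 2: onto signals that match the known samples.
        estimate = np.where(known_mask, samples, estimate)
    return estimate

# Hypothetical test signal: band-limited, with four samples missing.
n = 64
t = np.arange(n)
truth = np.cos(2 * np.pi * 2 * t / n) + 0.5 * np.sin(2 * np.pi * 3 * t / n)
mask = np.ones(n, dtype=bool)
mask[10:14] = False
recovered = pocs_interpolate(truth, mask, bandwidth=3)
max_error = np.max(np.abs(recovered - truth))
```

Because the true signal lies in both sets, the alternating projections converge to it in this toy setting. With constraint sets that do not intersect, POCS has no such guarantee.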

    Deep Structured Layers for Instance-Level Optimization in 2D and 3D Vision

    The approach we present in this thesis is that of integrating optimization problems as layers in deep neural networks. Optimization-based modeling provides an additional set of tools enabling the design of powerful neural networks for a wide battery of computer vision tasks. This thesis shows formulations and experiments for vision tasks ranging from image reconstruction to 3D reconstruction. We first propose an unrolled optimization method with implicit regularization properties for reconstructing images from noisy camera readings. The method resembles an unrolled majorization-minimization framework with convolutional neural networks acting as regularizers. We report state-of-the-art performance in image reconstruction on both noisy and noise-free evaluation setups across many datasets. We further focus on the task of monocular 3D reconstruction of articulated objects using video self-supervision. The proposed method uses a structured layer for accurate object deformation that controls a 3D surface by displacing a small number of learnable handles. While relying on a small set of training data per category for self-supervision, the method obtains state-of-the-art reconstruction accuracy with diverse shapes and viewpoints for multiple articulated objects. We finally address the shortcomings of the previous method that revolve around regressing the camera pose using multiple hypotheses. We propose a method that recovers a 3D shape from a 2D image by relying solely on 3D-2D correspondences regressed from a convolutional neural network. These correspondences are used in conjunction with an optimization problem to estimate per sample the camera pose and deformation. We quantitatively show the effectiveness of the proposed method on self-supervised 3D reconstruction on multiple categories without the need for multiple hypotheses.

    Skeletonization methods for image and volume inpainting


    Joint methods in imaging based on diffuse image representations

    This thesis deals with the application and the analysis of different variants of the Mumford-Shah model in the context of image processing. In models of this kind, a given function is approximated in a piecewise smooth or piecewise constant manner. The numerical treatment of the discontinuities in particular requires additional models, which are also outlined in this work. The main part of this thesis is concerned with four different topics. Simultaneous edge detection and registration of two images: The image edges are detected with the Ambrosio-Tortorelli model, an approximation of the Mumford-Shah model that approximates the discontinuity set with a phase field, and the registration is based on these edges. The registration obtained by this model is fully symmetric in the sense that the same matching is obtained if the roles of the two input images are swapped. Detection of grain boundaries from atomic scale images of metals or metal alloys: This is an image processing problem from materials science where atomic scale images are obtained either experimentally, for instance by transmission electron microscopy, or by numerical simulation tools. Grains are homogeneous material regions whose atomic lattice orientation differs from their surroundings. Based on a Mumford-Shah type functional, the grain boundaries are modeled as the discontinuity set of the lattice orientation. In addition to the grain boundaries, the model incorporates the extraction of a global elastic deformation of the atomic lattice. Numerically, the discontinuity set is modeled by a level set function following the approach by Chan and Vese. Joint motion estimation and restoration of motion-blurred video: A variational model for joint object detection, motion estimation and deblurring of consecutive video frames is proposed. For this purpose, a new motion blur model is developed that accurately describes the blur also close to the boundary of a moving object. 
Here, the video is assumed to consist of an object moving in front of a static background. The segmentation into object and background is handled by a Mumford-Shah type aspect of the proposed model. Convexification of the binary Mumford-Shah segmentation model: After considering the application of Mumford-Shah type models to tackle specific image processing problems in the previous topics, the Mumford-Shah model itself is studied more closely. Inspired by the work of Nikolova, Esedoglu and Chan, a method is developed that allows global minimization of the binary Mumford-Shah segmentation model by solving a convex, unconstrained optimization problem. In an outlook, segmentation of flow fields into piecewise affine regions using this convexification method is briefly discussed.

    Reduction of blocking artifacts using side information

    Thesis (S.M.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2006. Includes bibliographical references (p. 95-96). Block-based image and video coding systems are used extensively in practice. In low bit-rate applications, however, they suffer from annoying discontinuities, called blocking artifacts. Prior research shows that incorporating systems that reduce blocking artifacts into codecs is useful because visual quality is improved. Existing methods reduce blocking artifacts by applying various post-processing techniques to the compressed image. Such methods require neither any modification to current encoders nor an increase in the bit-rate. This thesis examines a framework where blocking artifacts are reduced using side information transmitted from the encoder to the decoder. Using side information enables the use of the original image in deblocking, which improves performance. Furthermore, the computational burden at the decoder is reduced. The principal question that arises is whether the gains in performance of this choice can compensate for the increase in the bit-rate due to the transmission of side information. Experiments are carried out to answer this question with the following sample system: The encoder determines block boundaries that exhibit blocking artifacts as well as filters (from a predefined set of filters) that best deblock these block boundaries. Then it transmits side information that conveys the determined block boundaries together with their selected filters to the decoder. The decoder uses the received side information to perform deblocking. The proposed sample system is compared against an ordinary coding system and a post-processing type deblocking system with the bit-rate of these systems being equal to the overall bit-rate (regular encoding bits + side information bits) of the proposed system. 
The results of the comparisons indicate that, both for images and video sequences, the proposed system can perform better in terms of both visual quality and PSNR for some range of coding bit-rates. By Fatih Kamisli. S.M.
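
The PSNR figure used in this comparison follows the standard definition, PSNR = 10 log10(peak^2 / MSE). A minimal sketch is given below, assuming 8-bit images with a peak value of 255. The function name and the toy arrays are illustrative and do not come from the thesis.

```python
import numpy as np

def psnr(reference, test, peak=255.0):
    """Peak signal-to-noise ratio in dB between two same-sized images."""
    reference = np.asarray(reference, dtype=np.float64)
    test = np.asarray(test, dtype=np.float64)
    mse = np.mean((reference - test) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * np.log10(peak ** 2 / mse)

# Hypothetical 8x8 blocks: a flat block and a copy offset by one gray level.
original = np.full((8, 8), 100, dtype=np.uint8)
decoded = np.full((8, 8), 101, dtype=np.uint8)
value_db = psnr(original, decoded)  # MSE is 1, so this equals 20*log10(255)
```

Inputs are converted to floating point before subtraction so that unsigned 8-bit arrays do not wrap around.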