
    Data-driven 3D Reconstruction and View Synthesis of Dynamic Scene Elements

    Our world is filled with living beings and other dynamic elements. It is important to record dynamic things and events for the sake of education, archeology, and cultural inheritance. From ancient to modern times, people have recorded dynamic scene elements in different ways, from sequences of cave paintings to frames of motion pictures. This thesis focuses on two key computer vision techniques by which dynamic element representation moves beyond video capture: 3D reconstruction and view synthesis. Although previous methods for these two tasks have been adopted to model and represent static scene elements, dynamic scene elements present unique and difficult challenges. This thesis focuses on three types of dynamic scene elements, namely 1) dynamic textures with static shapes, 2) dynamic shapes with static textures, and 3) dynamic illumination of static scenes. Two research aspects are explored to represent and visualize them: dynamic 3D reconstruction and dynamic view synthesis. Dynamic 3D reconstruction aims to recover the 3D geometry of dynamic objects and, by modeling the objects' movements, bring 3D reconstructions to life. Dynamic view synthesis, on the other hand, summarizes or predicts the dynamic appearance change of dynamic objects, for example, the daytime-to-nighttime illumination of a building or the future movements of a rigid body.
    We first target the problem of reconstructing dynamic textures of objects that have (approximately) fixed 3D shape but time-varying appearance. Examples of such objects include waterfalls, fountains, and electronic billboards. Since the appearance of dynamically textured objects can be random and complicated, estimating the 3D geometry of these objects from 2D images or video requires novel tools beyond the appearance-based point-correspondence methods of traditional 3D computer vision. To perform this 3D reconstruction, we introduce a method that simultaneously 1) segments dynamically textured scene objects in the input images and 2) reconstructs the 3D geometry of the entire scene, assuming a static 3D shape for the dynamically textured objects.
    Compared to dynamic textures, the appearance change of dynamic shapes is due to physically defined motions such as rigid-body movements. In these cases, assumptions can be made about the object's motion constraints in order to identify corresponding points on the object at different timepoints. For example, two points on a rigid object keep a constant distance between them in 3D space, no matter how the object moves. Based on this assumption of local rigidity, we propose a robust method to correctly identify point correspondences between two images viewing the same moving object from different viewpoints and at different times (a minimal code sketch of this rigidity test follows this abstract). Dense 3D geometry can then be obtained from the computed point correspondences. We apply this method to unsynchronized video streams and observe that the number of inlier correspondences it finds can serve as an indicator for frame alignment among the different streams.
    To model dynamic scene appearance caused by illumination changes, we propose a framework to find a sequence of images that share a similar geometric composition with a single reference image and show a smooth transition in illumination throughout the day. These images can be registered to visualize patterns of illumination change from a single viewpoint.
    The final topic of this thesis involves predicting the movements of dynamic shapes in the image domain. Towards this end, we propose deep neural network architectures to predict future views of dynamic motions, such as rigid-body movements and flowers blooming. Instead of predicting image pixels directly, our methods predict pixel offsets and iteratively synthesize future views.
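    As a concrete illustration of the local-rigidity test described in the abstract, the following minimal sketch filters candidate 3D correspondences by checking that pairwise distances are preserved between two timepoints. The relative tolerance, the 50% support threshold, and the function name are illustrative assumptions, not the thesis's actual algorithm.

```python
import numpy as np

def rigidity_inliers(pts_t0, pts_t1, tol=0.01):
    """Keep correspondences consistent with local rigidity.

    pts_t0, pts_t1 : (N, 3) matched 3D points at two timepoints.
    Returns a boolean inlier mask over the N correspondences.
    """
    # Pairwise distances within each timepoint; on a rigid object
    # these must be (nearly) identical across time.
    d0 = np.linalg.norm(pts_t0[:, None] - pts_t0[None, :], axis=-1)
    d1 = np.linalg.norm(pts_t1[:, None] - pts_t1[None, :], axis=-1)
    consistent = np.abs(d0 - d1) <= tol * (d0 + 1e-9)  # relative tolerance
    # A correspondence is an inlier if most of its pairwise distances
    # to the other points are preserved (the 50% threshold is assumed).
    return consistent.mean(axis=1) > 0.5
```

    The size of the returned inlier set, `rigidity_inliers(...).sum()`, is the kind of quantity the abstract suggests using to score candidate frame alignments between unsynchronized streams.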

    Art Inquiries Vol XVIII No 4 2023 Full Issue


    Modelling appearance and geometry from images

    Acquisition of realistic and relightable 3D models of large outdoor structures, such as buildings, requires the modelling of detailed geometry and visual appearance. Recovering these material characteristics can be very time consuming and requires specially dedicated equipment. Alternatively, surface detail can be conveyed by textures recovered from images, whose appearance is only valid under the originally photographed viewing and lighting conditions. Methods to easily capture locally detailed geometry, such as cracks in stone walls, and visual appearance require control of lighting conditions, and are usually restricted to small portions of surfaces captured at close range.
    This thesis investigates the acquisition of high-quality models from images, using simple photographic equipment and modest user intervention. The main focus of this investigation is on approximating detailed local depth information and visual appearance, obtained using a new image-based approach, and combining this with gross-scale 3D geometry. This is achieved by capturing these surface characteristics in small accessible regions and transferring them to the complete façade. This approach yields high-quality models, imparting the illusion of measured reflectance.
    In this thesis, we first present two novel algorithms for surface detail and visual appearance transfer, where these material properties are captured for small exemplars using an image-based technique. Second, we develop an interactive solution to the problems of performing the transfer both over a large change in scale and to the different materials contained in a complete façade. Aiming to fully automate this process, a novel algorithm to differentiate between the materials in the façade and associate them with the correct exemplars is introduced, with promising results. Third, we present a new method for texture reconstruction from multiple images that optimises texture quality by choosing the best view for every point and minimising seams (a sketch of this per-point view selection follows this abstract). Material properties are transferred from the exemplars to the texture map, approximating reflectance and meso-structure. The combination of these techniques results in a complete working system capable of producing realistic, relightable models of full building façades, containing high-resolution geometry and plausible visual appearance.
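    The per-point best-view selection mentioned above can be pictured with the following minimal sketch. The cosine-of-viewing-angle quality score is a common heuristic assumed here for illustration; the thesis's actual quality criterion and its seam-minimisation step (typically a pairwise smoothness term, solved for instance with graph cuts) are not reproduced.

```python
import numpy as np

def best_view_per_point(points, normals, cam_centers):
    """Pick, for each surface point, the camera that sees it most head-on.

    points      : (N, 3) surface points
    normals     : (N, 3) unit outward normals
    cam_centers : (M, 3) camera centres
    Returns an (N,) array of best camera indices.
    """
    # Unit direction from each point towards each camera: (N, M, 3).
    view_dirs = cam_centers[None, :, :] - points[:, None, :]
    view_dirs /= np.linalg.norm(view_dirs, axis=-1, keepdims=True)
    # Quality score: cosine between surface normal and view direction.
    scores = np.einsum('nd,nmd->nm', normals, view_dirs)
    return scores.argmax(axis=1)
```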

    Deep Neural Network Architectures and Learning Methodologies for Classification and Application in 3D Reconstruction

    In this work we explore two different scenarios of 3D reconstruction. The first, urban scenes, is approached using a deep learning network trained to identify structurally important classes within aerial imagery of cities. The network was trained using data taken from the ISPRS benchmark dataset of the city of Vaihingen. Using the segmented maps generated by the network, we can more accurately reconstruct the scenes through a process of clustering followed by class-specific model generation.
    The second scenario is that of underwater scenes. We use two separate networks to first identify caustics and then remove them from a scene (a sketch of this two-stage pipeline follows this abstract). Training data was generated synthetically, as real-world datasets for this subject are extremely hard to produce. Using the generated caustic-free images, we can then reconstruct the scene with more precision and accuracy through a process of structure from motion.
    We investigate different deep learning architectures and parameters for both scenarios. We evaluate our results against online benchmarks and alternative reconstruction attempts, finding our approach both efficient and effective. We conclude by discussing the limitations of problem-specific datasets and our potential research into the generation of datasets through the use of Generative Adversarial Networks.
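    The two-stage caustic-removal pipeline described above might look like the following sketch. `detector`, `inpainter`, the 4-channel input convention, and the 0.5 threshold are all hypothetical placeholders standing in for the two trained networks; the thesis's actual architectures and interfaces are not reproduced.

```python
import torch

def remove_caustics(frame, detector, inpainter, thresh=0.5):
    """Run a two-stage caustic-removal pipeline on one frame.

    frame : (1, 3, H, W) image tensor in [0, 1].
    detector, inpainter : hypothetical pre-trained torch modules.
    """
    with torch.no_grad():
        # Stage 1: localise caustics as a soft map, then threshold.
        mask = (detector(frame) > thresh).float()  # (1, 1, H, W)
        # Stage 2: hand the image plus its mask to the second network
        # so it knows which pixels to re-synthesise.
        clean = inpainter(torch.cat([frame, mask], dim=1))
    return clean
```

    The cleaned frames would then feed a standard structure-from-motion pipeline, as the abstract describes.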

    Efficient, image-based appearance acquisition of real-world objects

    Two ingredients are necessary to synthesize realistic images: an accurate rendering algorithm and, equally important, high-quality models in terms of geometry and reflection properties. In this dissertation we focus on capturing the appearance of real-world objects. The acquired model must represent both the geometry and the reflection properties of the object in order to create new views of the object under novel illumination. Starting from scanned 3D geometry, we measure the reflection properties (BRDF) of the object from images taken under known viewing and lighting conditions (a simplified sketch of such per-texel fitting follows this abstract). The BRDF measurement requires only a small number of input images and is made even more efficient by a view planning algorithm. In particular, we propose algorithms for efficient image-to-geometry registration, and an image-based measurement technique to reconstruct spatially varying materials from a sparse set of images using a point light source. Moreover, we present a view planning algorithm that calculates camera and light source positions for optimal quality and efficiency of the measurement process. Relightable models of real-world objects are in demand in various fields such as movie production, e-commerce, digital libraries, and virtual heritage.
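    As a deliberately simplified stand-in for the spatially varying BRDF measurement described above, the sketch below fits a single Lambertian albedo per texel from images taken under known point-light directions. Real SVBRDF fitting adds specular lobes and the view-planning step, both omitted here; the function name and interfaces are assumptions.

```python
import numpy as np

def fit_lambertian_albedo(intensities, normal, light_dirs):
    """Closed-form least-squares albedo for one texel.

    intensities : (K,) observed pixel values across K images
    normal      : (3,) unit surface normal at the texel
    light_dirs  : (K, 3) unit directions from the surface to the light
    """
    # Lambertian shading term per image: max(n . l, 0).
    shading = np.clip(light_dirs @ normal, 0.0, None)
    # Minimise || albedo * shading - intensities ||^2 in closed form.
    denom = float(shading @ shading)
    return float(shading @ intensities) / denom if denom > 0 else 0.0
```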

    Synthesizing and Editing Photo-realistic Visual Objects

    In this thesis we investigate novel methods of synthesizing new images of a deformable visual object using a collection of images of the object. We investigate both parametric and non-parametric methods, as well as a combination of the two, for the problem of image synthesis. Our main focus is complex visual objects, specifically deformable objects and objects with varying numbers of visible parts.
    We first introduce a sketch-driven image synthesis system, which allows the user to draw ellipses and outlines in order to sketch a rough shape of an animal as a constraint on the synthesized image. This system interactively provides feedback in the form of ellipse and contour suggestions to the user's partial sketch. The user's sketch guides the non-parametric synthesis algorithm, which blends patches from two exemplar images in a coarse-to-fine fashion to create a final image. We evaluate the method and the synthesized images through two user studies.
    Instead of non-parametric blending of patches, a parametric model of appearance is more desirable, as its appearance representation is shared between all images of the dataset. Hence, we propose Context-Conditioned Component Analysis (C-CCA), a probabilistic generative parametric model that describes images as a linear combination of basis functions. The basis functions are evaluated for each pixel using a context vector computed from local shape information (a sketch of this evaluation follows this abstract). We evaluate C-CCA qualitatively and quantitatively on inpainting, appearance transfer, and reconstruction tasks.
    Drawing samples from C-CCA generates novel, globally coherent images, which, unfortunately, lack high-frequency details due to dimensionality reduction and misalignment. We therefore develop a non-parametric model that enhances the samples of C-CCA with locally coherent, high-frequency details. The non-parametric model efficiently finds patches from the dataset that match the C-CCA sample and blends them together. We analyze the results of the combined method on datasets of horse and elephant images.
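    The per-pixel evaluation step described above can be pictured as follows. Linear basis functions phi_k(c) = B_k . c are assumed purely for illustration; C-CCA's actual basis parameterisation and its probabilistic machinery are not reproduced here.

```python
import numpy as np

def reconstruct_pixel(context, coeffs, basis_weights):
    """Evaluate one pixel as a context-conditioned linear combination.

    context       : (D,) context vector from local shape information
    coeffs        : (K,) per-image coefficients (the latent code)
    basis_weights : (K, D) one assumed-linear basis function per row
    """
    basis_values = basis_weights @ context  # phi_k(context) for all k
    return float(coeffs @ basis_values)     # sum_k w_k * phi_k(context)
```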

    Human-Centered Content-Based Image Retrieval

    Retrieval of images that lack (suitable) annotations cannot be achieved through traditional Information Retrieval (IR) techniques. Access to such collections can instead be achieved by applying computer vision techniques to the IR problem, an approach baptized Content-Based Image Retrieval (CBIR). In contrast with most purely technological approaches, the thesis Human-Centered Content-Based Image Retrieval approaches the problem from a human/user-centered perspective. Psychophysical experiments were conducted in which people were asked to categorize colors. The data gathered from these experiments was fed to a Fast Exact Euclidean Distance (FEED) transform (Schouten & Van den Broek, 2004), which enabled the segmentation of color space based on human perception (Van den Broek et al., 2008). This unique color space segmentation was exploited for texture analysis and image segmentation, and subsequently for full-featured CBIR (a sketch of category-based retrieval follows this abstract). In addition, a unique CBIR benchmark was developed (Van den Broek et al., 2004, 2005). This benchmark was used to explore what and how several parameters (e.g., color and distance measures) of the CBIR process influence retrieval results. In contrast with other research, users' judgements were adopted as the metric. The online IR and CBIR system Multimedia for Art Retrieval (M4ART) (URL: http://www.m4art.org) has been (partly) founded on the techniques discussed in this thesis.
    References:
    - Broek, E.L. van den, Kisters, P.M.F., and Vuurpijl, L.G. (2004). The utilization of human color categorization for content-based image retrieval. Proceedings of SPIE (Human Vision and Electronic Imaging), 5292, 351-362. [see also Chapter 7]
    - Broek, E.L. van den, Kisters, P.M.F., and Vuurpijl, L.G. (2005). Content-Based Image Retrieval Benchmarking: Utilizing Color Categories and Color Distributions. Journal of Imaging Science and Technology, 49(3), 293-301. [see also Chapter 8]
    - Broek, E.L. van den, Schouten, Th.E., and Kisters, P.M.F. (2008). Modeling Human Color Categorization. Pattern Recognition Letters, 29(8), 1136-1144. [see also Chapter 5]
    - Schouten, Th.E. and Broek, E.L. van den (2004). Fast Exact Euclidean Distance (FEED) transformation. In J. Kittler, M. Petrou, and M. Nixon (Eds.), Proceedings of the 17th IEEE International Conference on Pattern Recognition (ICPR 2004), Vol. 3, pp. 594-597. August 23-26, Cambridge, United Kingdom. [see also Appendix C]
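    In the spirit of the perceptual color-space segmentation described above, the sketch below builds an image histogram over a small set of human color categories and compares two images by histogram intersection. The `category_of` mapping and the 11-category vocabulary (the basic color terms) are illustrative assumptions, not the thesis's exact segmentation or benchmark protocol.

```python
import numpy as np

N_CATEGORIES = 11  # assumed vocabulary, e.g. the 11 basic color terms

def category_histogram(pixels, category_of):
    """Normalised histogram of an image over human color categories.

    pixels      : (N, 3) array of RGB rows
    category_of : hypothetical map from one RGB pixel to a category index
    """
    hist = np.zeros(N_CATEGORIES)
    for p in pixels:
        hist[category_of(p)] += 1
    return hist / max(len(pixels), 1)

def histogram_intersection(h1, h2):
    """One candidate similarity measure a CBIR benchmark could vary."""
    return float(np.minimum(h1, h2).sum())
```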

    Planetary Science Informatics and Data Analytics Conference : April 24–26, 2018, St. Louis, Missouri

    The PSIDA conference provides a forum to discuss approaches, challenges, and applications of informatics and data analytics technologies and capabilities in planetary science. Institutional support: NASA Planetary Data System Geosciences, Lunar and Planetary Institute. Chairs: Tom Stein, Washington University, St. Louis, USA; Dan Crichton, Jet Propulsion Laboratory, Pasadena, USA. Program Committee: Alphan Altinok, Jet Propulsion Laboratory, Pasadena, USA … [and 8 others]
    PARTIAL CONTENTS: ESA Planetary Science Archive Architecture and Data Management -- SPICE for ESA Planetary Missions -- VESPA: Enlarging the Virtual Observatory to Planetary Science -- SeaBIRD: A Flexible and Intuitive Planetary Datamining Infrastructure -- Model-Driven Development for PDS4 Software and Services -- The Need for a Planetary Spatial Data Clearinghouse -- The Relationship Between Planetary Spatial Data Infrastructure and the Planetary Data System -- Update on the NASA-USGS Planetary Spatial Data Infrastructure Inter-Agency Agreement -- MoonDB - A Data System for Analytical Data of Lunar Samples -- Large-Scale Numerical Simulations of Planetary Interiors -- Scalable Data Processing with the LROC Processing Pipelines -- PACKMAN-Net: A Distributed, Open-Access, and Scalable Network of User-Friendly Space Weather Stations

    Image-Based Rendering Of Real Environments For Virtual Reality
