93 research outputs found

    Capturing and Reconstructing the Appearance of Complex 3D Scenes

    In this thesis, we present our research on new acquisition methods for reflectance properties of real-world objects. Specifically, we first show a method for acquiring spatially varying densities in volumes of translucent, gaseous material with just a single image. This makes the method applicable to constantly changing phenomena like smoke without the use of high-speed camera equipment. Furthermore, we investigate how two well-known techniques -- synthetic aperture confocal imaging and algorithmic descattering -- can be combined to help look through a translucent medium like fog or murky water. We show that the depth at which we can still see an object embedded in the scattering medium is increased. In a related publication, we show how polarization and descattering based on phase-shifting can be combined for efficient 3D scanning of translucent objects. Normally, subsurface scattering hinders the range estimation by offsetting the peak intensity beneath the surface away from the point of incidence. With our method, the subsurface scattering is reduced to a minimum and therefore reliable 3D scanning is made possible. Finally, we present a system which recovers surface geometry, reflectance properties of opaque objects, and prevailing lighting conditions at the time of image capture from just a small number of input photographs. While there exist previous approaches to recover reflectance properties, our system is the first to work on images taken under almost arbitrary, changing lighting conditions. This enables us to use images we took from a community photo collection website.

    The delta radiance field

    The wide availability of mobile devices capable of computing high-fidelity graphics in real-time has sparked a renewed interest in the development and research of Augmented Reality applications. Within the large spectrum of mixed real and virtual elements, one specific area is dedicated to producing realistic augmentations with the aim of presenting virtual copies of existing real objects or soon-to-be-produced products. Surprisingly though, the current state of this area leaves much to be desired: augmenting objects in current systems are often presented without any reconstructed lighting whatsoever and therefore give the impression of being glued over a camera image rather than augmenting reality. In light of the advances in the movie industry, which has handled cases of mixed realities from one extreme end to another, it is a legitimate question to ask why such advances have not fully carried over to Augmented Reality simulations as well. Generally understood to be real-time applications which reconstruct the spatial relation of real-world elements and virtual objects, Augmented Reality has to deal with several uncertainties. Among them, unknown illumination and real scene conditions are the most important. Any kind of reconstruction of real-world properties in an ad-hoc manner must likewise be incorporated into an algorithm responsible for shading virtual objects and transferring virtual light to real surfaces in an ad-hoc fashion. The immersiveness of an Augmented Reality simulation is, next to its realism and accuracy, primarily dependent on its responsiveness. Any computation affecting the final image must be computed in real-time. This condition rules out many of the methods used for movie production.
The remaining real-time options face three problems: the shading of virtual surfaces under real natural illumination, the relighting of real surfaces according to the change in illumination due to the introduction of a new object into a scene, and the believable global interaction of real and virtual light. This dissertation presents contributions that address these problems. Current state-of-the-art methods build on Differential Rendering techniques to fuse global illumination algorithms into AR environments. This simple approach has a computationally costly downside, which limits the options for believable light transfer even further. This dissertation explores new shading and relighting algorithms built on a mathematical foundation replacing Differential Rendering. The result not only presents a more efficient competitor to the current state-of-the-art in global illumination relighting, but also advances the field with the ability to simulate effects which have not been demonstrated by contemporary publications until now.
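
For readers unfamiliar with the Differential Rendering baseline named in this abstract, the following is a minimal sketch of the classic compositing rule, not the dissertation's replacement method. It assumes grayscale pixels stored as flat lists of floats in [0, 1]; the function name `differential_composite` and all sample values are hypothetical. The rule needs two simulated renders of the local scene (with and without the virtual object), which is one plausible reading of the cost the abstract mentions.

```python
def differential_composite(camera, with_obj, without_obj, object_mask):
    """Classic differential rendering, per pixel.

    camera       -- the real camera image
    with_obj     -- simulated local scene rendered WITH the virtual object
    without_obj  -- simulated local scene rendered WITHOUT the virtual object
    object_mask  -- True where the virtual object itself covers the pixel
    """
    out = []
    for cam, lw, lwo, is_obj in zip(camera, with_obj, without_obj, object_mask):
        if is_obj:
            # The virtual object's own pixels come straight from the render.
            value = lw
        else:
            # Real pixels receive only the change in simulated light
            # (shadows darken, bounced light brightens).
            value = cam + (lw - lwo)
        # Clamp to the displayable range.
        out.append(min(1.0, max(0.0, value)))
    return out


# Hypothetical 4-pixel example: pixel 1 is shadowed by the object,
# pixel 2 is the object itself, pixel 3 receives some bounced light.
camera = [0.80, 0.80, 0.80, 0.80]
without_obj = [0.70, 0.70, 0.70, 0.70]
with_obj = [0.70, 0.40, 0.55, 0.75]
mask = [False, False, True, False]
result = differential_composite(camera, with_obj, without_obj, mask)
```

Because only the difference between the two renders is added to the camera image, errors in the simulated scene largely cancel on pixels the object does not affect.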

    Real-time Illumination and Visual Coherence for Photorealistic Augmented/Mixed Reality

    A realistically inserted virtual object in the real-time physical environment is a desirable feature in augmented reality (AR) applications and mixed reality (MR) in general. This problem is considered a vital research area in computer graphics, a field that is experiencing ongoing discovery. The algorithms and methods used for dynamic, real-time illumination measurement, estimation, and rendering of augmented reality scenes are utilized in many applications to achieve a realistic perception by humans. The continuous development of computer vision and machine learning techniques, together with established computer graphics and image processing methods, has had a powerful impact and has produced a significant range of novel AR/MR techniques. These techniques include methods for light source acquisition through image-based lighting or sampling, registering and estimating the lighting conditions, and composition of global illumination. In this review, we discuss the pipeline stages and elaborate on the methods and techniques that have contributed to photorealistic rendering, visual coherence, and interactive real-time illumination in AR/MR.

    Radiometric Scene Decomposition: Estimating Complex Reflectance and Natural Illumination from Images

    The phrase, "a picture is worth a thousand words," is often used to emphasize the wealth of information encoded into an image. While much of this information (e.g., the identities of people in an image, the type and number of objects in an image, etc.) is readily inferred by humans, fully understanding an image is still extremely difficult for computers. One important set of information encoded into images is radiometric scene properties---the properties of a scene related to light. Each pixel in an image indicates the amount of light received by the camera after being reflected, transmitted, or emitted by objects in a scene. It follows that we can learn about the objects of the scene and the scene itself through the image by thinking about the interaction between light and geometry in a scene. The appearance of objects in an image is primarily due to three factors: the geometry of the scene, the reflectance of the surfaces, and the incident illumination of the scene. Recovering these hidden properties of scenes can give us a deep understanding of a scene. For example, the reflectance of a surface can give a hint at the material properties of that surface. In this thesis, we address the question of how to recover complex, spatially-varying reflectance functions and natural illumination in real scenes from one or more images with known or approximately-known geometry. Recovering latent radiometric properties from images is difficult because of the severely underdetermined nature of the problem (i.e., there are many potential combinations of reflectance, light, and geometry that would produce identical input images) combined with the overwhelming dimensionality of the problem. In the real world, reflectance functions are complex, requiring many parameters to accurately model. An important aspect of solving this problem is to create a compact mathematical model to express a wide range of surface reflectance.
We must also carefully model scene illumination, which typically exhibits complex behavior as well. Prior work has often simply assumed the light incident to a scene is made up of one or more infinitely-distant point lights. This assumption, however, rarely holds up in practice: not only are scenes illuminated from every possible direction, they are also illuminated by light interreflected among objects. To accurately infer reflectance and illumination of real-world scenes, we must account for the real-world behavior of reflectance and illumination. In this work, we develop a mathematical framework for the inference of complex, spatially-varying reflectance and natural illumination in real-world scenes. We use a Bayesian approach, where the radiometric properties (i.e., reflectance and illumination) to be inferred are modeled as random variables. We can then apply statistical priors to model how reflectance and illumination often exist in the real world to help combat the ambiguities created through the image formation process. We use our framework to infer the reflectance and illumination in a variety of scenes, ultimately using it in unrestricted real-world scenes. We show that the framework is capable of recovering complex reflectance and natural illumination in the real world. Ph.D., Computer Science -- Drexel University, 201
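
The Bayesian formulation described in this abstract can be written generically as follows. This is a sketch in assumed notation, not the thesis's own equations: I stands for the observed image(s), G for the known geometry, R for reflectance, and L for illumination. It also assumes, for illustration only, that the priors on R and L are independent of each other and of G.

```latex
% Assumed notation (not taken from the abstract):
%   I = observed image(s), G = known geometry,
%   R = reflectance, L = illumination.
% Posterior over the unknown radiometric properties:
\[
  p(R, L \mid I, G) \;\propto\; p(I \mid R, L, G)\, p(R)\, p(L)
\]
% A maximum a posteriori (MAP) estimate under the same assumptions:
\[
  (\hat{R}, \hat{L}) \;=\; \arg\max_{R,\,L}\; p(I \mid R, L, G)\, p(R)\, p(L)
\]
```

The likelihood term encodes image formation, while the priors p(R) and p(L) are what discourage the many reflectance and illumination combinations that explain the image equally well but are unlikely in the real world.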

    Automated inverse-rendering techniques for realistic 3D artefact compositing in 2D photographs

    PhD Thesis. The process of acquiring images of a scene and modifying the defining structural features of the scene through the insertion of artefacts is known in the literature as compositing. The process can take effect in the 2D domain (where the artefact originates from a 2D image and is inserted into a 2D image), or in the 3D domain (the artefact is defined as a dense 3D triangulated mesh, with textures describing its material properties). Compositing originated as a solution to enhancing, repairing, and more broadly editing photographs and video data alike in the film industry as part of the post-production stage. This is generally thought of as carrying out operations in a 2D domain (a single image with a known width, height, and colour data). The operations involved are sequential and entail separating the foreground from the background (matting), or identifying features from contour (feature matching and segmentation) with the purpose of introducing new data in the original. Since then, compositing techniques have gained more traction in the emerging fields of Mixed Reality (MR), Augmented Reality (AR), robotics and machine vision (scene understanding, scene reconstruction, autonomous navigation). When focusing on the 3D domain, compositing can be translated into a pipeline (footnote 1): the incipient stage acquires the scene data, which then undergoes a number of processing steps aimed at inferring structural properties that ultimately allow for the placement of 3D artefacts anywhere within the scene, rendering a plausible and consistent result with regard to the physical properties of the initial input. This generic approach becomes challenging in the absence of user annotation and labelling of scene geometry, light sources and their respective magnitude and orientation, as well as a clear object segmentation and knowledge of surface properties.
A single image, a stereo pair, or even a short image stream may not hold enough information regarding the shape or illumination of the scene; however, increasing the input data incurs an extensive time penalty, which is an established challenge in the field. Recent state-of-the-art methods address the difficulty of inference in the absence of data; nonetheless, they do not attempt to solve the challenge of compositing artefacts between existing scene geometry, or cater for the inclusion of new geometry behind complex surface materials such as translucent glass or in front of reflective surfaces. The present work focuses on compositing in the 3D domain and brings forth a software framework that contributes solutions to a number of challenges encountered in the field, including the ability to render physically-accurate soft shadows in the absence of user-annotated scene properties or RGB-D data. Another contribution is the timely manner in which the framework achieves a believable result compared to other compositing methods, which rely on offline rendering. Neither proprietary hardware nor user expertise is required to achieve fast and reliable results within the current framework.
Footnote 1: In the present document, the term pipeline refers to a software solution formed of stand-alone modules or stages. It implies that the flow of execution runs in a single direction, and that each module has the potential to be used on its own as part of other solutions. Moreover, each module is assumed to take an input set and output data for the following stage, where each module addresses a single type of problem only.

    Illumination Invariant Deep Learning for Hyperspectral Data

    Motivated by the variability in hyperspectral images due to illumination and the difficulty in acquiring labelled data, this thesis proposes different approaches for learning illumination invariant feature representations and classification models for hyperspectral data captured outdoors, under natural sunlight. The approaches integrate domain knowledge into learning algorithms and hence do not rely on a priori knowledge of atmospheric parameters, additional sensors or large amounts of labelled training data. Hyperspectral sensors record rich semantic information from a scene, making them useful for robotics or remote sensing applications where perception systems are used to gain an understanding of the scene. Images recorded by hyperspectral sensors can, however, be affected to varying degrees by intrinsic factors relating to the sensor itself (keystone, smile, noise, particularly at the limits of the sensed spectral range) but also by extrinsic factors such as the way the scene is illuminated. The appearance of the scene in the image is tied to the incident illumination, which is dependent on variables such as the position of the sun, geometry of the surface and the prevailing atmospheric conditions. Effects like shadows can cause the appearance and spectral characteristics of identical materials to differ significantly. This degrades the performance of high-level algorithms that use hyperspectral data, such as those that do classification and clustering. If sufficient training data is available, learning algorithms such as neural networks can capture variability in the scene appearance and be trained to compensate for it. Learning algorithms are advantageous for this task because they do not require a priori knowledge of the prevailing atmospheric conditions or data from additional sensors.
Labelling of hyperspectral data is, however, difficult and time-consuming, so acquiring enough labelled samples for the learning algorithm to adequately capture the scene appearance is challenging. Hence, there is a need for techniques that are invariant to the effects of illumination and that do not require large amounts of labelled data. In this thesis, an approach to learning a representation of hyperspectral data that is invariant to the effects of illumination is proposed. This approach combines a physics-based model of the illumination process with an unsupervised deep learning algorithm, and thus requires no labelled data. Datasets that vary both temporally and spatially are used to compare the proposed approach to other similar state-of-the-art techniques. The results show that the learnt representation is more invariant to shadows in the image and to variations in brightness due to changes in the scene topography or position of the sun in the sky. The results also show that a supervised classifier can predict class labels more accurately and more consistently across time when images are represented using the proposed method. Additionally, this thesis proposes methods to train supervised classification models to be more robust to variations in illumination where only limited amounts of labelled data are available. The transfer of knowledge from well-labelled datasets to poorly labelled datasets for classification is investigated. A method is also proposed for enabling small amounts of labelled samples to capture the variability in spectra across the scene. These samples are then used to train a classifier to be robust to the variability in the data caused by variations in illumination. The results show that these approaches make convolutional neural network classifiers more robust and achieve better performance when there is limited labelled training data.
A case study is presented where a pipeline is proposed that incorporates the methods proposed in this thesis for learning robust feature representations and classification models. A scene is clustered using no labelled data. The results show that the pipeline groups the data into clusters that are consistent with the spatial distribution of the classes in the scene as determined from ground truth.

    Programmable Image-Based Light Capture for Previsualization

    Previsualization is a class of techniques for creating approximate previews of a movie sequence in order to visualize a scene prior to shooting it on the set. Often these techniques are used to convey the artistic direction of the story in terms of cinematic elements, such as camera movement, angle, lighting, dialogue, and character motion. Essentially, a movie director uses previsualization (previs) to convey movie visuals as he sees them in his mind's eye. Traditional methods for previs include hand-drawn sketches, storyboards, scaled models, and photographs, which are created by artists to convey how a scene or character might look or move. A recent trend has been to use 3D graphics applications such as video game engines to perform previs, which is called 3D previs. This type of previs is generally used prior to shooting a scene in order to choreograph camera or character movements. To visualize a scene while being recorded on-set, directors and cinematographers use a technique called On-set previs, which provides a real-time view with little to no processing. Other types of previs, such as Technical previs, emphasize accurately capturing scene properties but lack any interactive manipulation and are usually employed by visual effects crews and not by cinematographers or directors. This dissertation's focus is on creating a new method for interactive visualization that will automatically capture the on-set lighting and provide interactive manipulation of cinematic elements to facilitate the movie maker's artistic expression, validate cinematic choices, and provide guidance to production crews. Our method will overcome the drawbacks of all previous previs methods by combining photorealistic rendering with accurately captured scene details, which are interactively displayed on a mobile capture and rendering platform.
This dissertation describes a new hardware and software previs framework that enables interactive visualization of on-set post-production elements. The main contribution of this dissertation is a three-tiered framework: 1) a novel programmable camera architecture that provides programmability to low-level features and a visual programming interface, 2) new algorithms that analyze and decompose the scene photometrically, and 3) a previs interface that leverages the previous two tiers to perform interactive rendering and manipulation of the photometric and computer generated elements. For this dissertation we implemented a programmable camera with a novel visual programming interface. We developed the photometric theory and implementation of our novel relighting technique called Symmetric lighting, which can be used to relight a scene with multiple illuminants with respect to color, intensity and location on our programmable camera. We analyzed the performance of Symmetric lighting on synthetic and real scenes to evaluate the benefits and limitations with respect to the reflectance composition of the scene and the number and color of lights within the scene. We found that, because our method is based on a Lambertian reflectance assumption, it works well when that assumption holds, but scenes with high amounts of specular reflection can have higher errors in terms of relighting accuracy, and additional steps are required to mitigate this limitation. Also, scenes which contain lights whose colors are too similar can lead to degenerate cases in terms of relighting. Despite these limitations, an important contribution of our work is that Symmetric lighting can also be leveraged as a solution for performing multi-illuminant white balancing and light color estimation within a scene with multiple illuminants without limits on the color range or number of lights.
We compared our method to other white balance methods and showed that our method is superior when at least one of the light colors is known a priori.
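
To make the Lambertian assumption in this abstract concrete, here is a generic two-light illustration; it is not the dissertation's Symmetric lighting algorithm. It assumes each pixel is albedo times a per-channel mix of two light colors, with a per-pixel mixing weight `beta` that is taken as already known (how to estimate it is outside this sketch). The function name `relight_pixel` and all values are hypothetical.

```python
def relight_pixel(pixel, beta, old_lights, new_lights, eps=1e-6):
    """Swap light colors on one Lambertian pixel.

    pixel      -- observed (r, g, b)
    beta       -- assumed-known fraction of this pixel's light from light 1
    old_lights -- ((r, g, b), (r, g, b)) colors present at capture time
    new_lights -- ((r, g, b), (r, g, b)) desired replacement colors
    """
    out = []
    for ch in range(3):
        old_mix = beta * old_lights[0][ch] + (1.0 - beta) * old_lights[1][ch]
        new_mix = beta * new_lights[0][ch] + (1.0 - beta) * new_lights[1][ch]
        # Under the Lambertian model the albedo is pixel / old_mix,
        # so relighting is a per-channel rescale.
        out.append(pixel[ch] * new_mix / max(old_mix, eps))
    return tuple(out)


# Hypothetical pixel: grey albedo 0.5 lit half by a reddish light
# and half by a bluish light.
reddish, bluish = (1.0, 0.5, 0.5), (0.5, 0.5, 1.0)
observed = (0.375, 0.25, 0.375)

# Multi-illuminant white balance is the special case where both
# replacement lights are white.
white = (1.0, 1.0, 1.0)
balanced = relight_pixel(observed, 0.5, (reddish, bluish), (white, white))
```

The sketch relies on purely diffuse reflection, which is consistent with the abstract's remark that strongly specular scenes show larger relighting errors.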

    Survey on Controlable Image Synthesis with Deep Learning

    Image synthesis has attracted growing research interest in academia and industry. Deep learning technologies, especially generative models, have greatly inspired controllable image synthesis approaches and applications, which aim to generate particular visual contents with latent prompts. In order to further investigate the low-level controllable image synthesis problem, which is crucial for fine image rendering and editing tasks, we present a survey of some recent works on 3D controllable image synthesis using deep learning. We first introduce the datasets and evaluation indicators for 3D controllable image synthesis. Then, we review the state-of-the-art research for geometrically controllable image synthesis in two aspects: 1) viewpoint/pose-controllable image synthesis; 2) structure/shape-controllable image synthesis. Furthermore, the photometrically controllable image synthesis approaches are also reviewed for 3D re-lighting research. While the emphasis is on 3D controllable image synthesis algorithms, the related applications, products and resources are also briefly summarized for practitioners. Comment: 19 pages, 17 figures