
    Acquisition, Modeling, and Augmentation of Reflectance for Synthetic Optical Flow Reference Data

    This thesis is concerned with the acquisition, modeling, and augmentation of material reflectance to simulate high-fidelity synthetic data for computer vision tasks. The topic is covered in three chapters: I commence by exploring the upper limits of reflectance acquisition. I analyze state-of-the-art BTF reflectance field renderings and show that they can be applied to optical flow performance analysis, with performance closely matching that on real-world images. Next, I present two methods for fitting efficient BRDF reflectance models to measured BTF data. Combined, the two methods retain all relevant reflectance information as well as per-pixel surface normal details. I further show that the resulting synthesized images are suited for optical flow performance analysis, with virtually identical performance for all material types. Finally, I present a novel method for augmenting real-world datasets with physically plausible precipitation effects, including ground surface wetting, water droplets on the windshield, and water spray and mist. This is achieved by projecting the real-world image data onto a reconstructed virtual scene, manipulating the scene and the surface reflectance, and performing unbiased light transport simulation of the precipitation effects.
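    The BRDF-fitting step lends itself to a small illustration. The sketch below assumes a Lambertian-plus-Blinn-Phong model with a known specular exponent (an illustrative stand-in, not the thesis's actual reflectance model): with the exponent fixed, the diffuse and specular albedos enter linearly and can be recovered per texel by least squares.

```python
import numpy as np

rng = np.random.default_rng(0)

def blinn_phong(n, l, v, kd, ks, shininess):
    """Reflectance for unit normal n, light directions l, view directions v."""
    h = l + v
    h = h / np.linalg.norm(h, axis=-1, keepdims=True)  # half-vector
    diff = np.clip((n * l).sum(-1), 0.0, None)
    spec = np.clip((n * h).sum(-1), 0.0, None) ** shininess
    return kd * diff + ks * spec

def rand_dirs(m):
    """Random unit vectors on the upper hemisphere."""
    d = rng.normal(size=(m, 3))
    d[:, 2] = np.abs(d[:, 2])
    return d / np.linalg.norm(d, axis=1, keepdims=True)

# Synthesize "measured" reflectance samples for one texel.
n = np.array([0.0, 0.0, 1.0])
L, V = rand_dirs(200), rand_dirs(200)
kd_true, ks_true, shin = 0.6, 0.3, 40.0
y = blinn_phong(n, L, V, kd_true, ks_true, shin)

# With the exponent fixed, kd and ks enter linearly: solve by least squares.
h = L + V
h /= np.linalg.norm(h, axis=1, keepdims=True)
A = np.stack([np.clip(L @ n, 0, None),
              np.clip(h @ n, 0, None) ** shin], axis=1)
kd_fit, ks_fit = np.linalg.lstsq(A, y, rcond=None)[0]
print(kd_fit, ks_fit)
```

    In practice the exponent would be fit as well (making the problem nonlinear), and the fit would run over every texel of the measured BTF.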

    Orientation Analysis in 4D Light Fields

    This work is about the analysis of 4D light fields. In the context of this work, a light field is a series of 2D digital images of a scene captured on a planar regular grid of camera positions. It is essential that the scene is captured from several camera positions at constant distances to each other. This results in a sampling of the light rays emitted by a single scene point as a function of camera position. In contrast to traditional images, which measure light intensity in the spatial domain, this approach additionally captures directional information, leading to the four-dimensionality mentioned above. For image processing, light fields are a relatively new research area. In computer graphics, they were used to avoid the work-intensive modeling of 3D geometry, instead using view interpolation to achieve interactive 3D experiences without explicit geometry. The intention of this work is the reverse, namely using light fields to reconstruct the geometry of a captured scene. The reason is that light fields provide much richer information than existing approaches to 3D reconstruction. Due to the regular and dense sampling of the scene, material properties are imaged alongside geometry. Surfaces whose visual appearance changes with the line of sight cause problems for known approaches to passive 3D reconstruction; light fields instead sample this change in appearance and thus make its analysis possible. This thesis makes several contributions. We propose a new approach to convert raw data from a light field camera (plenoptic camera 2.0) to a 4D representation without pre-computing pixel-wise depth. This special representation, also called the Lumigraph, gives access to epipolar planes, which are sub-spaces of the 4D data structure. An approach is proposed that analyzes these epipolar plane images to achieve robust depth estimation on Lambertian surfaces.
Based on this, an extension is presented that also handles reflective and transparent surfaces. As examples of the usefulness of this inherently available depth information, we show improvements to well-known techniques such as super-resolution and object segmentation when extending them to light fields. Additionally, a benchmark database was established over the course of the research for this thesis. We test the proposed approaches using this database and hope that it helps drive future research in this field.
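    The epipolar-plane-image analysis at the heart of the depth estimation can be sketched as follows, assuming Lambertian reflectance: a scene point traces a line in the EPI whose slope encodes disparity, and a least-squares fit over the image gradients (equivalent in spirit to the structure-tensor orientation estimate commonly used on EPIs) recovers that slope. The synthetic EPI and its numbers below are illustrative only.

```python
import numpy as np

# Synthetic epipolar-plane image (EPI): row s is a 1D scanline from camera
# position s; a Lambertian scene point traces a line u = u0 + d*s whose
# slope d (the disparity) is inversely proportional to depth.
n_views, width, d_true = 32, 96, 0.8
u = np.arange(width)
s = np.arange(n_views) - n_views / 2
epi = np.exp(-0.5 * ((u[None, :] - 40.0 - d_true * s[:, None]) / 3.0) ** 2)

# For I(s, u) = f(u - d*s) we have dI/ds = -d * dI/du, so a least-squares
# fit of the gradients over the whole patch recovers the disparity.
gs, gu = np.gradient(epi)              # derivatives along view axis s and u
d_est = -(gs * gu).sum() / (gu * gu).sum()
print(d_est)
```

    Reflective or transparent surfaces superimpose several such line orientations in one EPI patch, which is exactly what the thesis's extension has to disentangle.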

    Development of a calibration pipeline for a monocular-view structured illumination 3D sensor utilizing an array projector

    Commercial off-the-shelf digital projection systems are commonly used in active structured illumination photogrammetry of macro-scale surfaces due to their relatively low cost, accessibility, and ease of use. They can be described by an inverse pinhole model. The calibration pipeline of a 3D sensor utilizing pinhole devices in a projector-camera configuration is already well-established. Recently, there have been advances in creating projection systems offering projection speeds greater than those available from conventional off-the-shelf digital projectors. However, they cannot be calibrated using well-established techniques based on the pinhole assumption, as they are chip-less and lack a projection lens. This work is based on the utilization of unconventional projection systems known as array projectors, which contain not one but multiple projection channels that project a temporal sequence of illumination patterns. None of the channels implements a digital projection chip or a projection lens. To work around the calibration problem, previous realizations of a 3D sensor based on an array projector required a stereo-camera setup, with triangulation taking place between the two pinhole-modelled cameras instead. However, a monocular setup is desired, as a single-camera configuration results in decreased cost, weight, and form factor. This study presents a novel calibration pipeline that realizes a single-camera setup. A generalized intrinsic calibration process without model assumptions was developed that directly samples the illumination frustum of each array projection channel. An extrinsic calibration process was then created that determines the pose of the single camera through a downhill simplex optimization initialized by particle swarm. Lastly, a method to store the intrinsic calibration with the aid of an easily realizable calibration jig was developed for re-use in arbitrary measurement camera positions, so that the intrinsic calibration does not have to be repeated.
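    The extrinsic step described above, a downhill simplex optimization initialized by particle swarm, can be sketched on a toy pose problem. Here a 2D rigid transform stands in for the full camera pose, the cost is a made-up point-alignment error, and SciPy's Nelder-Mead implementation is assumed to be available:

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(1)

# Known calibration-jig points and their observations under an unknown
# 2D rigid pose (tx, ty, theta) -- a stand-in for the full camera pose.
pts = rng.uniform(-1, 1, size=(12, 2))
true = np.array([0.4, -0.2, 0.7])

def transform(p, pose):
    tx, ty, th = pose
    R = np.array([[np.cos(th), -np.sin(th)], [np.sin(th), np.cos(th)]])
    return p @ R.T + [tx, ty]

obs = transform(pts, true)

def cost(pose):
    return ((transform(pts, pose) - obs) ** 2).sum()

# 1) Particle swarm: coarse global search over the pose space.
n, w, c1, c2 = 30, 0.7, 1.5, 1.5
x = rng.uniform([-1, -1, -np.pi], [1, 1, np.pi], size=(n, 3))
v = np.zeros_like(x)
pbest, pcost = x.copy(), np.array([cost(p) for p in x])
for _ in range(60):
    g = pbest[pcost.argmin()]
    v = (w * v + c1 * rng.random((n, 1)) * (pbest - x)
               + c2 * rng.random((n, 1)) * (g - x))
    x = x + v
    c = np.array([cost(p) for p in x])
    improved = c < pcost
    pbest[improved], pcost[improved] = x[improved], c[improved]

# 2) Downhill simplex (Nelder-Mead) refines the swarm's best estimate.
res = minimize(cost, pbest[pcost.argmin()], method="Nelder-Mead")
print(res.x)
```

    The swarm avoids the local minima a simplex started from a bad guess could fall into, while the simplex delivers the final precision; the real pipeline applies the same pattern to the full 6-DoF camera pose against the sampled projector frusta.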

    NaRPA: Navigation and Rendering Pipeline for Astronautics

    This paper presents the Navigation and Rendering Pipeline for Astronautics (NaRPA), a novel ray-tracing-based computer graphics engine to model and simulate light transport for space-borne imaging. NaRPA incorporates lighting models with attention to atmospheric and shading effects for the synthesis of space-to-space and ground-to-space virtual observations. In addition to image rendering, the engine also possesses point cloud, depth, and contour map generation capabilities to simulate passive and active vision-based sensors and to facilitate the design, testing, and verification of visual navigation algorithms. The physically based rendering capabilities of NaRPA and the efficacy of the proposed rendering algorithm are demonstrated using applications in representative space-based environments. A key demonstration includes NaRPA as a tool for generating stereo imagery and its application to 3D coordinate estimation using triangulation. In another prominent application, a novel differentiable rendering approach for image-based attitude estimation is proposed, highlighting the efficacy of the NaRPA engine for simulating vision-based navigation and guidance operations. Comment: 49 pages, 22 figures
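    The stereo-triangulation demonstration mentioned above follows the standard linear (DLT) triangulation of a point from two pinhole views. The intrinsics and baseline in this sketch are invented for the example, not taken from the paper:

```python
import numpy as np

def triangulate(P1, P2, x1, x2):
    """Linear (DLT) triangulation of one point from two pinhole views.
    P1, P2: 3x4 projection matrices; x1, x2: pixel coordinates (u, v)."""
    A = np.stack([x1[0] * P1[2] - P1[0],
                  x1[1] * P1[2] - P1[1],
                  x2[0] * P2[2] - P2[0],
                  x2[1] * P2[2] - P2[1]])
    _, _, Vt = np.linalg.svd(A)     # null vector of A = homogeneous point
    X = Vt[-1]
    return X[:3] / X[3]

# Two synthetic cameras half a metre apart, both looking down +Z.
K = np.array([[500.0, 0, 320], [0, 500.0, 240], [0, 0, 1]])
P1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = K @ np.hstack([np.eye(3), [[-0.5], [0.0], [0.0]]])

X_true = np.array([0.3, -0.1, 4.0])
def project(P, X):
    x = P @ np.append(X, 1.0)
    return x[:2] / x[2]

X_est = triangulate(P1, P2, project(P1, X_true), project(P2, X_true))
print(X_est)
```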

    Automated inverse-rendering techniques for realistic 3D artefact compositing in 2D photographs

    PhD Thesis. The process of acquiring images of a scene and modifying the defining structural features of the scene through the insertion of artefacts is known in the literature as compositing. The process can take effect in the 2D domain (where the artefact originates from a 2D image and is inserted into a 2D image), or in the 3D domain (the artefact is defined as a dense 3D triangulated mesh, with textures describing its material properties). Compositing originated as a solution for enhancing, repairing, and more broadly editing photographs and video data alike in the film industry as part of the post-production stage. This is generally thought of as carrying out operations in a 2D domain (a single image with known width, height, and colour data). The operations involved are sequential and entail separating the foreground from the background (matting), or identifying features from contours (feature matching and segmentation), with the purpose of introducing new data into the original. Since then, compositing techniques have gained more traction in the emerging fields of Mixed Reality (MR), Augmented Reality (AR), robotics, and machine vision (scene understanding, scene reconstruction, autonomous navigation). When focusing on the 3D domain, compositing can be translated into a pipeline (here, a software solution formed of stand-alone modules, where execution flows in a single direction and each module can be used on its own, takes an input set, outputs data for the following stage, and addresses a single type of problem): the incipient stage acquires the scene data, which then undergoes a number of processing steps aimed at inferring structural properties that ultimately allow for the placement of 3D artefacts anywhere within the scene, rendering a plausible and consistent result with regard to the physical properties of the initial input. This generic approach becomes challenging in the absence of user annotation and labelling of scene geometry, light sources and their respective magnitude and orientation, as well as a clear object segmentation and knowledge of surface properties.
A single image, a stereo pair, or even a short image stream may not hold enough information regarding the shape or illumination of the scene; however, increasing the input data will only incur an extensive time penalty, which is an established challenge in the field. Recent state-of-the-art methods address the difficulty of inference in the absence of data; nonetheless, they do not attempt to solve the challenge of compositing artefacts between existing scene geometry, or cater for the inclusion of new geometry behind complex surface materials such as translucent glass or in front of reflective surfaces. The present work focuses on compositing in the 3D domain and brings forth a software framework that contributes solutions to a number of challenges encountered in the field, including the ability to render physically accurate soft shadows in the absence of user-annotated scene properties or RGB-D data. Another contribution consists in the timely manner in which the framework achieves a believable result compared to other compositing methods, which rely on offline rendering. Proprietary hardware and user expertise are two of the main factors that are not required to achieve fast and reliable results within the current framework.
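    One common formulation of such artefact compositing is differential rendering: the reconstructed scene is rendered with and without the artefact, and the difference is added to the photograph so that the background picks up shadows and interreflections while artefact pixels come from the render. This is a generic textbook sketch, not necessarily the thesis's exact pipeline:

```python
import numpy as np

def differential_composite(photo, render_with, render_without, mask):
    """Differential rendering: artefact pixels are taken from the render;
    everywhere else the rendered light-transport difference (shadows,
    interreflections) is added to the original photograph."""
    delta = render_with - render_without
    out = photo + delta                   # background picks up shadows etc.
    out[mask] = render_with[mask]         # artefact pixels from the render
    return np.clip(out, 0.0, 1.0)

# Toy 4x4 gray photo; the artefact occupies one pixel and darkens a neighbour.
photo = np.full((4, 4), 0.5)
render_without = np.full((4, 4), 0.4)     # reconstructed scene, no artefact
render_with = render_without.copy()
mask = np.zeros((4, 4), dtype=bool)
mask[1, 1] = True
render_with[1, 1] = 0.9                   # the artefact itself
render_with[2, 1] = 0.2                   # its cast shadow

out = differential_composite(photo, render_with, render_without, mask)
print(out)
```

    Note how the shadowed pixel darkens by exactly the rendered difference (0.5 + 0.2 - 0.4 = 0.3) while untouched pixels keep their photographed values.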

    Phenomenological modeling of image irradiance for non-Lambertian surfaces under natural illumination.

    Various vision tasks are usually confronted by appearance variations due to changes of illumination. For instance, in a recognition system, it has been shown that the variability in human face appearance is owed more to changes in lighting conditions than to the person's identity. Theoretically, due to the arbitrariness of the lighting function, the space of all possible images of a fixed-pose object under all possible illumination conditions is infinite-dimensional. Nonetheless, it has been proven that the set of images of a convex Lambertian surface under distant illumination lies near a low-dimensional linear subspace. This result was also extended to include non-Lambertian objects with non-convex geometry. As such, vision applications concerned with the recovery of illumination, reflectance, or surface geometry from images would benefit from a low-dimensional generative model which captures appearance variations w.r.t. illumination conditions and surface reflectance properties. This enables the formulation of such inverse problems as parameter estimation. Typically, subspace construction boils down to performing a dimensionality reduction scheme, e.g. Principal Component Analysis (PCA), on a large set of (real or synthesized) images of the object(s) of interest with fixed pose but different illumination conditions. However, this approach has two major problems. First, the acquired/rendered image ensemble should be statistically significant vis-a-vis capturing the full behavior of the sources of variation of interest, in particular illumination and reflectance. Second, the curse of dimensionality hinders numerical methods such as Singular Value Decomposition (SVD), which becomes intractable especially with a large number of large-sized realizations in the image ensemble.
One way to bypass the need for a large image ensemble is to construct appearance subspaces using phenomenological models which capture appearance variations through mathematical abstraction of the reflection process. In particular, the harmonic expansion of the image irradiance equation can be used to derive an analytic subspace to represent images under fixed pose but different illumination conditions, where the image irradiance equation has been formulated in a convolution framework. Due to their low-frequency nature, irradiance signals can be represented using low-order basis functions, for which Spherical Harmonics (SH) have been extensively adopted. Typically, an ideal solution to the image irradiance (appearance) modeling problem should be able to incorporate complex illumination, cast shadows, and realistic surface reflectance properties, while moving away from the simplifying assumptions of Lambertian reflectance and single-source distant illumination. By handling arbitrary complex illumination and non-Lambertian reflectance, the appearance model proposed in this dissertation moves the state of the art closer to the ideal solution. This work primarily addresses the geometrical compliance of the hemispherical basis for representing surface reflectance while presenting a compact, yet accurate, representation for arbitrary materials. To maintain the plausibility of the resulting appearance, the proposed basis is constructed in a manner that satisfies the Helmholtz reciprocity property while avoiding high computational complexity. It is believed that representing the illumination and surface reflectance in the spherical and hemispherical domains, respectively, while complying with the physical properties of surface reflectance, provides better approximation accuracy of image irradiance than representation in the spherical domain alone.
Discounting subsurface scattering and surface emittance, this work proposes a surface reflectance basis, based on hemispherical harmonics (HSH), defined on the Cartesian product of the incoming and outgoing local hemispheres (i.e. w.r.t. surface points). This basis obeys the physical properties of surface reflectance, namely reciprocity and energy conservation. The basis functions are validated using analytical reflectance models as well as scattered reflectance measurements, which may violate the Helmholtz reciprocity property (this can be filtered out by projecting them on the subspace spanned by the proposed basis, where the reciprocity property is preserved in the least-squares sense). The image formation process of isotropic surfaces under arbitrary distant illumination is also formulated in frequency space, where the orthogonality relation between illumination and reflectance bases is encoded in what is termed irradiance harmonics. Such harmonics decouple the effect of illumination and reflectance from the underlying pose and geometry. Further, a bilinear approach to analytically construct the irradiance subspace is proposed in order to tackle the inherent problems of small sample size and the curse of dimensionality. The process of finding the analytic subspace is posed as establishing a relation between its principal components and those of the irradiance harmonics basis functions. It is also shown how to incorporate prior information about natural illumination and real-world surface reflectance characteristics in order to capture the full behavior of complex illumination and non-Lambertian reflectance. The use of the presented theoretical framework to develop practical algorithms for shape recovery is further presented, where the hitherto assumed Lambertian assumption is relaxed.
With a single image under unknown general illumination, the underlying geometrical structure can be recovered while accounting explicitly for object reflectance characteristics (e.g. human skin types for facial images and teeth reflectance for human jaw reconstruction) as well as complex illumination conditions. Experiments on synthetic and real images illustrate the robustness of the proposed appearance model vis-a-vis illumination variation. Keywords: computer vision, computer graphics, shading, illumination modeling, reflectance representation, image irradiance, frequency space representations, (hemi)spherical harmonics, analytic bilinear PCA, model-based bilinear PCA, 3D shape reconstruction, statistical shape from shading.
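    The least-squares enforcement of Helmholtz reciprocity mentioned above has a simple finite-dimensional analogue: for a BRDF tabulated over pairs of incoming/outgoing directions, the nearest reciprocal table in the least-squares (Frobenius) sense is the symmetric part. The random table below is illustrative only:

```python
import numpy as np

rng = np.random.default_rng(2)

# A tabulated BRDF over m x m (incoming, outgoing) direction pairs; raw
# scattered measurements may violate reciprocity f(i, o) = f(o, i).
m = 16
f = rng.uniform(0.0, 1.0, size=(m, m))

# Projecting onto the subspace of reciprocal functions in the least-squares
# sense amounts, for this tabulation, to taking the symmetric part:
f_rec = 0.5 * (f + f.T)
print(np.abs(f_rec - f_rec.T).max())   # reciprocity now holds exactly
```

    Because this is an orthogonal projection, no other reciprocal table lies closer to the measurement, which is the sense in which the proposed basis "filters out" reciprocity violations.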

    Innovative Techniques for Digitizing and Restoring Deteriorated Historical Documents

    Recent large-scale document digitization initiatives have created new modes of access to modern library collections through the development of new hardware and software technologies. Most commonly, these digitization projects focus on accurately scanning bound texts, some reaching an efficiency of more than one million volumes per year. While vast digital collections are changing the way users access texts, current scanning paradigms cannot handle many non-standard materials. Documentation forms such as manuscripts, scrolls, codices, deteriorated film, epigraphy, and rock art all hold a wealth of human knowledge in physical forms not accessible by standard book scanning technologies. This great omission motivates the development of new technology, presented in this thesis, that is not only effective with deteriorated bound works, damaged manuscripts, and disintegrating photonegatives but also easily utilized by non-technical staff. First, a novel point light source calibration technique is presented that can be performed by library staff. Then, a photometric correction technique, which uses known illumination and surface properties to remove shading distortions in deteriorated document images, can be applied automatically. To complete the restoration process, a geometric correction is applied. Also unique to this work is the development of an image-based uncalibrated document scanner that utilizes the transmissivity of document substrates. This scanner extracts intrinsic document color information from one or both sides of a document. Simultaneously, the document shape is estimated to obtain distortion information. Lastly, this thesis provides a restoration framework for damaged photographic negatives that corrects photometric and geometric distortions. Current restoration techniques for the discussed form of negatives require physical manipulation of the photograph.
The novel acquisition and restoration system presented here provides the first known solution to digitize and restore deteriorated photographic negatives without damaging the original negative in any way. This thesis develops new methods of document scanning and restoration suitable for wide-scale deployment. By creating easy-to-access technologies, library staff can implement their own scanning initiatives and large-scale scanning projects can expand their current document sets.
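    In its simplest form, the photometric correction step above reduces to dividing out a known shading field so the restored document appears uniformly lit. A toy sketch (the document and illumination falloff below are invented for illustration):

```python
import numpy as np

def remove_shading(observed, shading, target=1.0):
    """Photometric correction: divide out a known illumination/shading
    field so the restored document has uniform lighting."""
    eps = 1e-6                                  # guard against division by ~0
    return np.clip(observed * (target / np.maximum(shading, eps)), 0.0, 1.0)

# Toy document: flat reflectance 0.8 with ink at 0.2, lit by a smooth falloff.
reflectance = np.full((8, 8), 0.8)
reflectance[3:5, 3:5] = 0.2
xx = np.linspace(0.0, 1.0, 8)
shading = 1.0 - 0.5 * xx[None, :]               # brightness falls off rightward
observed = reflectance * shading

restored = remove_shading(observed, shading)
print(restored)
```

    The actual pipeline derives the shading field from the calibrated point light source and surface estimates rather than assuming it is given.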

    Neural representations for object capture and rendering

    Photometric stereo is a classical computer vision problem with applications ranging from gaming and VR/AR avatars to movie visual effects, all of which require a faithful reconstruction of an object in a new space and thus a thorough understanding of the object's visual properties. With the advent of Neural Radiance Fields (NeRFs) in the early 2020s, we witnessed the incredible photorealism provided by the method and its potential beyond. However, original NeRFs do not provide any information about the material and lighting of the objects in focus. Therefore, we propose to tackle the multiview photometric stereo problem using an extension of NeRFs. We provide three novel contributions through this work. First, the Relightable NeRF model, an extension of the original NeRF in which appearance is conditioned on a point light source direction. It provides two use cases: it is able to learn from varying lighting, and it can relight under arbitrary conditions. Second, Neural BRDF Fields, which extend the relightable NeRF by introducing explicit models for surface reflectance and shadowing. The parameters of the BRDF are learnable as a neural field, enabling spatially varying reflectance; the local surface normal direction is learned as another neural field. We experiment with both a fixed BRDF (Lambertian) and a learnable (i.e. neural) reflectance model, which guarantees a realistic BRDF by tying the neural network to the BRDF's physical properties. In addition, local shadowing is learned as a function of light source direction, enabling the reconstruction of cast shadows. Finally, Neural Implicit Fields for Merging Monocular Photometric Stereo switches from NeRF's volume density function to a signed distance function representation. This provides a straightforward means to compute the surface normal direction and thus ties normal-based losses directly to the geometry.
We use this representation to address the problem of merging the output of monocular photometric stereo methods into a single unified model: a neural SDF and a neural field capturing diffuse albedo, from which we can extract a textured mesh.
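    The advantage of the signed distance representation, that surface normals follow directly from the geometry, can be shown in a few lines: the normal is the normalized gradient of the SDF. Here it is computed by finite differences on an analytic sphere; a neural SDF would obtain the same gradient through autograd:

```python
import numpy as np

def sdf_sphere(p, radius=1.0):
    """Signed distance to a sphere centred at the origin."""
    return np.linalg.norm(p, axis=-1) - radius

def sdf_normal(sdf, p, h=1e-5):
    """Surface normal as the normalised finite-difference gradient of the
    SDF -- the quantity normal-based losses can be tied to directly."""
    e = np.eye(3) * h
    g = np.array([(sdf(p + e[i]) - sdf(p - e[i])) / (2 * h)
                  for i in range(3)])
    return g / np.linalg.norm(g)

p = np.array([0.6, 0.0, 0.8])          # a point on the unit sphere
n = sdf_normal(sdf_sphere, p)
print(n)
```

    For a sphere the true normal is the radial direction p/|p|, which the gradient reproduces; a NeRF-style volume density offers no such direct handle on the surface orientation.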

    Gradient Domain Methods for Image-based Reconstruction and Rendering

    This thesis describes new approaches to image-based 3D reconstruction and rendering. In contrast to previous work, our algorithms focus on image gradients instead of pixel values, which allows us to avoid many of the disadvantages of traditional techniques. A single pixel carries only very local information about the image content; a gradient, on the other hand, reveals the magnitude and direction in which the image content changes. Our techniques use this additional information to adapt dynamically to the image content. Especially in image regions without strong gradients, we can employ more suitable reconstruction models and render images with fewer artifacts. Overall, we present more accurate and robust results (both 3D models and renderings) compared to previous methods. First, we present a multi-view stereo algorithm that combines traditional stereo reconstruction and shading-based reconstruction models in a single optimization scheme. By defining a gradient-based trade-off, our model removes the need for an explicit regularization and can handle shading information without an explicit albedo model. This effectively combines the strengths of both reconstruction approaches and cancels out their weaknesses. Our second method is an image-based rendering technique that directly renders gradients instead of pixels; the final image is then generated by integrating over the rendered gradients. We present a detailed description of how gradients can be moved directly in the image during rendering, which allows us to create a fast approximation that improves the quality and speed of the integration step. Our method also handles occlusions, and compared to traditional approaches we achieve better results that are especially robust for scenes with reflective or textureless areas. Finally, we also present a new model for image warping, applying different types of regularization constraints based on the gradients in the image.
Especially when used for direct real-time rendering, this can handle larger distortions than traditional methods that use only a single type of regularization. Overall, the results of this thesis show how shifting the focus from image pixels to image gradients can improve various aspects of image-based reconstruction and rendering. Some of the most challenging aspects, such as textureless areas in rendering and spatially varying albedo in reconstruction, are handled implicitly by our formulations, which also leads to more effective algorithms.
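    The "integrate over the rendered gradients" step is a gradient-domain (Poisson-type) reconstruction: every gradient is a linear equation in the unknown pixels, plus one anchor equation fixing the global offset. Below is a dense least-squares toy version on an 8x8 image; a real renderer would use a fast Poisson solver instead:

```python
import numpy as np

rng = np.random.default_rng(3)
H, W = 8, 8
img = rng.uniform(size=(H, W))

# Forward-difference gradients, as a gradient-domain renderer would produce.
gx = img[:, 1:] - img[:, :-1]
gy = img[1:, :] - img[:-1, :]

# Recover the image by least squares: one linear equation per gradient,
# plus one anchor equation fixing the unknown global offset.
n = H * W
idx = np.arange(n).reshape(H, W)
rows, rhs = [], []

def eq(i_pos, i_neg, value):
    r = np.zeros(n)
    r[i_pos], r[i_neg] = 1.0, -1.0
    rows.append(r)
    rhs.append(value)

for y in range(H):
    for x in range(W - 1):
        eq(idx[y, x + 1], idx[y, x], gx[y, x])
for y in range(H - 1):
    for x in range(W):
        eq(idx[y + 1, x], idx[y, x], gy[y, x])
anchor = np.zeros(n)
anchor[0] = 1.0
rows.append(anchor)
rhs.append(img[0, 0])

sol, *_ = np.linalg.lstsq(np.array(rows), np.array(rhs), rcond=None)
recon = sol.reshape(H, W)
print(np.abs(recon - img).max())
```

    With exact gradients the system is consistent and the reconstruction is perfect; with rendered (noisy, possibly inconsistent) gradients, the same least-squares formulation distributes the error smoothly over the image.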

    Measurement, modeling and perception of painted surfaces: A multi-scale analysis of the touch-up problem

    Real-world surfaces typically have geometric features at a range of spatial scales. At the microscale, opaque surfaces are often characterized by bidirectional reflectance distribution functions (BRDFs), which describe how a surface scatters incident light. At the mesoscale, surfaces often exhibit visible texture: stochastic or patterned arrangements of geometric features that provide visual information about surface properties such as roughness, smoothness, or softness. These textures also affect how light is scattered by the surface, but the effects occur at a different spatial scale from those captured by the BRDF. Through this research, we investigate how microscale and mesoscale surface properties interact to contribute to overall surface appearance. This behavior is also the cause of the well-known touch-up problem in the paint industry, where two regions coated with exactly the same paint look different in color, gloss, and/or texture because of differences in application methods. First, samples were created by applying latex paint to standard wallboard surfaces using two application methods: spraying and rolling. The BRDF and texture properties of the samples were measured, which revealed differences at both the microscale and the mesoscale. These data were then used as input to a physically based image synthesis algorithm to generate realistic images of the surfaces under different viewing conditions. To understand the factors that govern touch-up visibility, psychophysical tests were conducted using calibrated digital photographs of the samples as stimuli. Images were presented in pairs, and a two-alternative forced-choice design was used for the experiments. These judgments were then used as data for a Thurstonian scaling analysis to produce psychophysical scales of visibility, which helped determine the effect of paint formulation, application methods, and viewing and illumination conditions on the touch-up problem.
The results can be used as base data for the development of a psychophysical model that relates physical differences in paint formulation and application methods to visual differences in surface appearance.
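    The Thurstonian scaling step can be sketched with Thurstone's Case V model: paired-comparison proportions are converted to z-scores via the inverse normal CDF and averaged per stimulus. The proportion matrix below is invented for illustration, not drawn from the study's data:

```python
import numpy as np
from statistics import NormalDist

# Proportion-of-times-chosen matrix from paired comparisons: P[i, j] is the
# fraction of trials in which stimulus i was judged more visible than j.
P = np.array([[0.50, 0.75, 0.90],
              [0.25, 0.50, 0.80],
              [0.10, 0.20, 0.50]])

# Thurstone Case V: convert proportions to z-scores and average each row.
inv = np.vectorize(NormalDist().inv_cdf)
Z = inv(np.clip(P, 0.01, 0.99))        # clip to avoid infinite z-scores
scale = Z.mean(axis=1)
scale -= scale.min()                   # anchor the least visible stimulus at 0
print(scale)
```

    The resulting interval scale orders the stimuli by touch-up visibility, which is what lets the study compare paint formulations and application methods on a common axis.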