32 research outputs found

    Practical SVBRDF Acquisition of 3D Objects with Unstructured Flash Photography

    Get PDF
    Capturing spatially-varying bidirectional reflectance distribution functions (SVBRDFs) of 3D objects with just a single, hand-held camera (such as an off-the-shelf smartphone or a DSLR camera) is a difficult, open problem. Previous works are either limited to planar geometry, or rely on previously scanned 3D geometry, thus limiting their practicality. There are several technical challenges that need to be overcome: First, the built-in flash of a camera is almost colocated with the lens, and at a fixed position; this severely hampers sampling procedures in the light-view space. Moreover, the near-field flash lights the object partially and unevenly. In terms of geometry, existing multiview stereo techniques assume diffuse reflectance only, which leads to overly smoothed 3D reconstructions, as we show in this paper. We present a simple yet powerful framework that removes the need for expensive, dedicated hardware, enabling practical acquisition of SVBRDF information from real-world, 3D objects with a single, off-the-shelf camera with a built-in flash. In addition, by removing the diffuse reflection assumption and leveraging instead such SVBRDF information, our method outputs high-quality 3D geometry reconstructions, including more accurate high-frequency details than state-of-the-art multiview stereo techniques. We formulate the joint reconstruction of SVBRDFs, shading normals, and 3D geometry as a multi-stage, iterative inverse-rendering reconstruction pipeline. Our method is also directly applicable to any existing multiview 3D reconstruction technique. We present results of captured objects with complex geometry and reflectance; we also validate our method numerically against other existing approaches that rely on dedicated hardware, additional sources of information, or both

    Towards Scalable Multi-View Reconstruction of Geometry and Materials

    Full text link
    In this paper, we propose a novel method for joint recovery of camera pose, object geometry and spatially-varying Bidirectional Reflectance Distribution Function (svBRDF) of 3D scenes that exceed object-scale and hence cannot be captured with stationary light stages. The input are high-resolution RGB-D images captured by a mobile, hand-held capture system with point lights for active illumination. Compared to previous works that jointly estimate geometry and materials from a hand-held scanner, we formulate this problem using a single objective function that can be minimized using off-the-shelf gradient-based solvers. To facilitate scalability to large numbers of observation views and optimization variables, we introduce a distributed optimization algorithm that reconstructs 2.5D keyframe-based representations of the scene. A novel multi-view consistency regularizer effectively synchronizes neighboring keyframes such that the local optimization results allow for seamless integration into a globally consistent 3D model. We provide a study on the importance of each component in our formulation and show that our method compares favorably to baselines. We further demonstrate that our method accurately reconstructs various objects and materials and allows for expansion to spatially larger scenes. We believe that this work represents a significant step towards making geometry and material estimation from hand-held scanners scalable

    BRDF representation and acquisition

    Get PDF
    Photorealistic rendering of real world environments is important in a range of different areas; including Visual Special effects, Interior/Exterior Modelling, Architectural Modelling, Cultural Heritage, Computer Games and Automotive Design. Currently, rendering systems are able to produce photorealistic simulations of the appearance of many real-world materials. In the real world, viewer perception of objects depends on the lighting and object/material/surface characteristics, the way a surface interacts with the light and on how the light is reflected, scattered, absorbed by the surface and the impact these characteristics have on material appearance. In order to re-produce this, it is necessary to understand how materials interact with light. Thus the representation and acquisition of material models has become such an active research area. This survey of the state-of-the-art of BRDF Representation and Acquisition presents an overview of BRDF (Bidirectional Reflectance Distribution Function) models used to represent surface/material reflection characteristics, and describes current acquisition methods for the capture and rendering of photorealistic materials

    Self-Supervised Shape and Appearance Modeling via Neural Differentiable Graphics

    Get PDF
    Inferring 3D shape and appearance from natural images is a fundamental challenge in computer vision. Despite recent progress using deep learning methods, a key limitation is the availability of annotated training data, as acquisition is often very challenging and expensive, especially at a large scale. This thesis proposes to incorporate physical priors into neural networks that allow for self-supervised learning. As a result, easy-to-access unlabeled data can be used for model training. In particular, novel algorithms in the context of 3D reconstruction and texture/material synthesis are introduced, where only image data is available as supervisory signal. First, a method that learns to reason about 3D shape and appearance solely from unstructured 2D images, achieved via differentiable rendering in an adversarial fashion, is proposed. As shown next, learning from videos significantly improves 3D reconstruction quality. To this end, a novel ray-conditioned warp embedding is proposed that aggregates pixel-wise features from multiple source images. Addressing the challenging task of disentangling shape and appearance, first a method that enables 3D texture synthesis independent of shape or resolution is presented. For this purpose, 3D noise fields of different scales are transformed into stationary textures. The method is able to produce 3D textures, despite only requiring 2D textures for training. Lastly, the surface characteristics of textures under different illumination conditions are modeled in the form of material parameters. Therefore, a self-supervised approach is proposed that has no access to material parameters but only flash images. Similar to the previous method, random noise fields are reshaped to material parameters, which are conditioned to replicate the visual appearance of the input under matching light

    Utilisation de l'Apparence pour le Rendu et l'édition efficaces de scènes capturées

    Get PDF
    Computer graphics strives to render synthetic images identical to real photographs. Multiple rendering algorithms have been developed for the better part of the last half-century. Traditional algorithms use 3D assets manually generated by artists to render a scene. While the initial scenes were quite simple, the field has developed complex representations of geometry, material and lighting: the three basic components of a 3D scene. Generating such complex assets is hard and requires significant time and skills by professional 3D artists. In addition to asset generation, the rendering algorithms themselves involve complex simulation techniques to solve for global light transport in a scene which costs more time.As the ease of capturing photographs improved, Image-based Rendering (IBR) emerged as an alternative to traditional rendering. Using captured images as input became much faster than generating traditional scene assets. Initial IBR algorithms focused on creating a scene model using the input images to interpolate or warp them and enable free-viewpoint navigation of captured scenes. With time the scene models became more complex and using a geometric proxy computed from the input images became an integral part of IBR. Today using a mesh reconstructed using Structure-from-Motion (SfM) and Multi-view Stereo (MVS) techniques is widely used in IBR even though they introduce significant artifacts due to noisy reconstruction.In this thesis we first propose a novel image-based rendering algorithm, which focuses on rendering a captured scene with good quality at interactive frame rates}. We study different artifacts from previous IBR algorithms and propose an algorithm which builds upon previous work to remove such artifacts. The algorithm utilizes surface appearance in order to treat view-dependent regions differently than diffuse regions. Our Hybrid-IBR algorithm performs favorably against classical and modern IBR approaches for a wide variety of scenes in terms of quality and/or speed.While IBR provides solutions to render a scene, editing them is hard. Editing scenes require estimating a scene's geometry, material appearance and illumination. As our second contribution \textbf{we explicitly estimate \emph{scene-scale} material parameters from a set of captured photographs to enable scene editing}. While commercial photogrammetry solutions recover diffuse texture to aid 3D artists in generating material assets manually, we aim to \emph{automatically} create material texture atlases from captured images of a scene. We take advantage of the visual cues provided by the multi-view observations. Feeding it to a Convolutional Neural Network (CNN) we obtain material maps for each view. Using the predicted maps we create multi-view consistent material texture atlases by aggregating the information in texture space. Using our automatically generated material texture atlases we demonstrate relighting and object insertion in real scenes.Learning-based tasks require large amounts of data with variety to learn the task efficiently. Using synthetic datasets to train is the norm but using traditional rendering to render large datasets is time consuming providing limited variability. We propose \textbf{a new neural rendering-based approach that learns a neural scene representation with variability and use it to generate large amounts of data at a significantly faster rate on the fly}. We demonstrate the advantage of using neural rendering as compared to traditional rendering in terms of speed of generating dataset as well as learning auxiliary tasks given the same computational budget.L’informatique graphique a pour but de rendre des images de synthèse semblables à des photographies. Plusieurs algorithmes de rendu ont été développés au cours du dernier demi-siècle, principalement pour restituer des scènes à base d'éléments 3D créés par des artistes. Alors que les scènes initiales étaient assez simples, des représentations plus complexes de la géométrie, des matériaux et de l'éclairage ont été développés. Créer des scènes aussi complexes nécessite beaucoup de travail et de compétences de la part d'artistes 3D professionnels. Au même temps, les algorithmes de rendu impliquent des techniques de simulation complexes coûteuses en temps, pour résoudre le transport global de la lumière dans une scène.Avec la popularité grandissante de la photo numérique, le rendu basé image (IBR) a émergé comme une alternative au rendu traditionnel. Avec cette approche, l'utilisation de photos comme données d'entrée est devenue beaucoup plus rapide que la génération de scènes classiques. Les algorithmes IBR se sont d’abord concentrés sur la restitution de scènes pour en permettre une exploration libre. Au fil du temps, les modèles de scène sont devenus plus complexes et l'utilisation d'un proxy géométrique inféré à partir d’images est devenue la norme. Aujourd'hui, l'utilisation d'un maillage reconstruit à l'aide des techniques Structure-from-Motion (SfM) et Multi-view Stereo (MVS) est courante en IBR, bien que cette utilisation introduit des artefacts importants. Nous proposons d'abord un nouvel algorithme de rendu basé image, qui se concentre sur le rendu de qualité et en temps interactif d'une scène capturée}. Nous étudions différentes faiblesses des travaux précédents et proposons un algorithme qui s'appuie sur ces travaux pour obtenir de meilleurs résultats. Notre algorithme se base sur l'apparence de la surface pour traiter les régions dont l'apparence dépend de l'angle de vue différemment des régions diffuses. Hybrid-IBR obtient des résultats favorables par rapport aux approches concurrentes pour une grande variété de scènes en termes de qualité et/ou de vitesse.Bien que l'IBR soit une bonne solution de rendu, l'édition de celle-ci est difficile sans une décomposition en différents éléments : la géométrie, l'apparence des matériaux et l'éclairage de la scène. Pour notre deuxième contribution, \textbf{nous estimons explicitement les paramètres de matériaux à \emph{l'échelle de la scène} à partir d'un ensemble de photographies, pour permettre l'édition de la scène}. Alors que les solutions de photogrammétrie commerciales calculent la texture diffuse pour assister la création manuelle de matériaux, nous visons à créer \emph{automatiquement} des atlas de texture de matériaux à partir d'un ensemble d'images d'une scène. Nous nous appuyons sur les informations fournis par ces images et les transmettons à un réseau neuronal convolutif pour obtenir des cartes de matériaux pour chaque vue. En utilisant toutes ces prédictions, nous créons des atlas de texture de matériau cohérents pour toutes les vues en agrégeant les informations dans l'espace texture. Nous démontrons l'utilisation de notre atlas de texture de matériaux généré automatiquement pour rendre des scènes réelles avec un changement d’illumination et avec des objets virtuels insérés.L'apprentissage profond nécessite de grandes quantités de données variées. L'utilisation de données synthétiques est courante, mais l'utilisation du rendu traditionnel pour créer ces données prend du temps et offre une variabilité limitée. Nous proposons \textbf{une nouvelle approche basée sur le rendu neuronal qui apprend une représentation de scène neuronale avec paramètres variables, et l'utilise pour générer au vol de grandes quantités de données à un rythme beaucoup plus rapide}. Nous démontrons l'avantage d'utiliser le rendu neuronal par rapport au rendu traditionnel en termes de budget de temps, ainsi que pour l'apprentissage de tâches auxiliaires avec le même budget de calcul

    Deep scene-scale material estimation from multi-view indoor captures

    Get PDF
    International audienceThe movie and video game industries have adopted photogrammetry as a way to create digital 3D assets from multiple photographs of a real-world scene. But photogrammetry algorithms typically output an RGB texture atlas of the scene that only serves as visual guidance for skilled artists to create material maps suitable for physically-based rendering. We present a learning-based approach that automatically produces digital assets ready for physically-based rendering, by estimating approximate material maps from multi-view captures of indoor scenes that are used with retopologized geometry. We base our approach on a material estimation Convolutional Neural Network (CNN) that we execute on each input image. We leverage the view-dependent visual cues provided by the multiple observations of the scene by gathering, for each pixel of a given image, the color of the corresponding point in other images. This image-space CNN provides us with an ensemble of predictions, which we merge in texture space as the last step of our approach. Our results demonstrate that the recovered assets can be directly used for physically-based rendering and editing of real indoor scenes from any viewpoint and novel lighting. Our method generates approximate material maps in a fraction of time compared to the closest previous solutions

    NeRFactor: Neural Factorization of Shape and Reflectance Under an Unknown Illumination

    Full text link
    We address the problem of recovering the shape and spatially-varying reflectance of an object from multi-view images (and their camera poses) of an object illuminated by one unknown lighting condition. This enables the rendering of novel views of the object under arbitrary environment lighting and editing of the object's material properties. The key to our approach, which we call Neural Radiance Factorization (NeRFactor), is to distill the volumetric geometry of a Neural Radiance Field (NeRF) [Mildenhall et al. 2020] representation of the object into a surface representation and then jointly refine the geometry while solving for the spatially-varying reflectance and environment lighting. Specifically, NeRFactor recovers 3D neural fields of surface normals, light visibility, albedo, and Bidirectional Reflectance Distribution Functions (BRDFs) without any supervision, using only a re-rendering loss, simple smoothness priors, and a data-driven BRDF prior learned from real-world BRDF measurements. By explicitly modeling light visibility, NeRFactor is able to separate shadows from albedo and synthesize realistic soft or hard shadows under arbitrary lighting conditions. NeRFactor is able to recover convincing 3D models for free-viewpoint relighting in this challenging and underconstrained capture setup for both synthetic and real scenes. Qualitative and quantitative experiments show that NeRFactor outperforms classic and deep learning-based state of the art across various tasks. Our videos, code, and data are available at people.csail.mit.edu/xiuming/projects/nerfactor/.Comment: Camera-ready version for SIGGRAPH Asia 2021. Project Page: https://people.csail.mit.edu/xiuming/projects/nerfactor
    corecore