
    Material acquisition using deep learning

    Texture, highlights, and shading are some of the many visual cues that allow humans to perceive material appearance in pictures. Designing algorithms able to leverage these cues to recover spatially-varying bi-directional reflectance distribution functions (SVBRDFs) from a few images has challenged computer graphics researchers for decades. We explore the use of deep learning to tackle lightweight appearance capture and make sense of these visual cues. Our networks are capable of recovering per-pixel normals, diffuse albedo, specular albedo and specular roughness from as little as one picture of a flat surface lit by a hand-held flash. We propose a method which improves its prediction with the number of input pictures, and reaches high-quality reconstructions with up to 10 images -- a sweet spot between existing single-image and complex multi-image approaches. We introduce several innovations in training data acquisition and network design, bringing clear improvement over the state of the art for lightweight material capture.
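
    As a minimal illustration (my assumptions about channel layout and activations, not the thesis code), the four per-pixel maps named above can be read off a single network output tensor, with a unit-length constraint on normals and [0, 1] activations on the reflectance channels:

    import torch
    import torch.nn.functional as F

    def split_svbrdf(raw):                            # raw: (B, 10, H, W) network output
        normal = F.normalize(raw[:, 0:3], dim=1)      # per-pixel unit normal
        diffuse = torch.sigmoid(raw[:, 3:6])          # diffuse albedo in [0, 1]
        specular = torch.sigmoid(raw[:, 6:9])         # specular albedo in [0, 1]
        roughness = torch.sigmoid(raw[:, 9:10])       # specular roughness in [0, 1]
        return normal, diffuse, specular, roughness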

    Flexible SVBRDF Capture with a Multi-Image Deep Network

    Empowered by deep learning, recent methods for material capture can estimate a spatially-varying reflectance from a single photograph. Such lightweight capture is in stark contrast with the tens or hundreds of pictures required by traditional optimization-based approaches. However, a single image is often simply not enough to observe the rich appearance of real-world materials. We present a deep-learning method capable of estimating material appearance from a variable number of uncalibrated and unordered pictures captured with a handheld camera and flash. Thanks to an order-independent fusing layer, this architecture extracts the most useful information from each picture, while benefiting from strong priors learned from data. The method can handle both view and light direction variation without calibration. We show how our method improves its prediction with the number of input pictures, and reaches high-quality reconstructions with as little as 1 to 10 images -- a sweet spot between existing single-image and complex multi-image approaches.
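
    The order-independent fusing layer can be sketched as a shared per-image encoder followed by a max-pool over the image axis (a simplified stand-in for the paper's architecture; layer sizes and the 10-channel output are assumptions):

    import torch
    import torch.nn as nn

    class FusionNet(nn.Module):
        def __init__(self, feat=64):
            super().__init__()
            self.encoder = nn.Sequential(nn.Conv2d(3, feat, 3, padding=1), nn.ReLU(),
                                         nn.Conv2d(feat, feat, 3, padding=1), nn.ReLU())
            self.decoder = nn.Conv2d(feat, 10, 3, padding=1)   # SVBRDF map channels

        def forward(self, images):                 # images: (B, N, 3, H, W), any N
            b, n, c, h, w = images.shape
            feats = self.encoder(images.view(b * n, c, h, w)).view(b, n, -1, h, w)
            fused = feats.max(dim=1).values        # pooling over images is order-independent
            return self.decoder(fused)

    # maps = FusionNet()(torch.rand(1, 5, 3, 256, 256))   # e.g. 5 unordered photos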

    Node Graph Optimization Using Differentiable Proxies

    Graph-based procedural materials are ubiquitous in content production industries. Procedural models allow the creation of photorealistic materials with parametric control for flexible editing of appearance. However, designing a specific material is a time-consuming process in terms of building a model and fine-tuning parameters. Previous work [Hu et al. 2022; Shi et al. 2020] introduced material graph optimization frameworks for matching target material samples. However, these previous methods were limited to optimizing differentiable functions in the graphs. In this paper, we propose a fully differentiable framework which enables end-to-end gradient-based optimization of material graphs, even if some functions of the graph are non-differentiable. We leverage the Differentiable Proxy, a differentiable approximator of a non-differentiable black-box function. We use our framework to match the structure and appearance of an output material to a target material, through a multi-stage differentiable optimization. Differentiable Proxies offer a more general optimization solution to material appearance matching than previous work.
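
    The Differentiable Proxy idea can be illustrated with a toy sketch (hypothetical node and sizes, not the paper's implementation): fit a small network to mimic a non-differentiable node, then back-propagate through the proxy to optimize the node's parameters:

    import torch
    import torch.nn as nn

    def black_box_node(params):                   # stand-in non-differentiable node
        return torch.round(params * 10.0) / 10.0  # quantization has zero gradient

    proxy = nn.Sequential(nn.Linear(4, 64), nn.ReLU(), nn.Linear(64, 4))
    opt = torch.optim.Adam(proxy.parameters(), lr=1e-3)
    for _ in range(2000):                         # 1) train the proxy on random samples
        p = torch.rand(128, 4)
        loss = ((proxy(p) - black_box_node(p)) ** 2).mean()
        opt.zero_grad(); loss.backward(); opt.step()

    target = black_box_node(torch.rand(1, 4))     # 2) optimize node parameters to match
    params = torch.rand(1, 4, requires_grad=True)
    p_opt = torch.optim.Adam([params], lr=1e-2)
    for _ in range(500):
        loss = ((proxy(params) - target) ** 2).mean()   # gradients flow via the proxy
        p_opt.zero_grad(); loss.backward(); p_opt.step()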

    Generating Procedural Materials from Text or Image Prompts

    Node graph systems are used ubiquitously for material design in computer graphics. They allow the use of visual programming to achieve desired effects without writing code. As high-level design tools, they provide convenience and flexibility, but mastering the creation of node graphs usually requires professional training. We propose an algorithm capable of generating multiple node graphs from different types of prompts, significantly lowering the bar for users to explore a specific design space. Previous work was limited to unconditional generation of random node graphs, making the generation of an envisioned material challenging. We propose a multi-modal node graph generation neural architecture for high-quality procedural material synthesis which can be conditioned on different inputs (text or image prompts), using a CLIP-based encoder. We also create a substantially augmented material graph dataset, key to improving the generation quality. Finally, we generate high-quality graph samples using a regularized sampling process and improve the matching quality by differentiable optimization for top-ranked samples. We compare our methods to CLIP-based database search baselines (which are themselves novel) and achieve superior or similar performance without requiring massive data storage. We further show that our model can produce a set of material graphs either unconditionally or conditioned on images, text prompts or partial graphs, serving as a tool for automatic visual programming completion.
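
    One way to picture the top-ranked sample selection is a cosine-similarity ranking in a shared CLIP-like embedding space (the embed_text and embed_images helpers below are hypothetical placeholders, not the paper's code):

    import torch
    import torch.nn.functional as F

    def rank_candidates(prompt_embedding, candidate_embeddings, top_k=4):
        # prompt_embedding: (D,); candidate_embeddings: (N, D), one per generated graph
        sims = F.cosine_similarity(candidate_embeddings,
                                   prompt_embedding.unsqueeze(0), dim=1)
        return sims.topk(top_k).indices            # keep the best matches for refinement

    # best = rank_candidates(embed_text("rusty metal"), embed_images(renders))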

    ControlMat: A Controlled Generative Approach to Material Capture

    Material reconstruction from a photograph is a key component of 3D content creation democratization. We propose to formulate this ill-posed problem as a controlled synthesis one, leveraging the recent progress in generative deep networks. We present ControlMat, a method which, given a single photograph with uncontrolled illumination as input, conditions a diffusion model to generate plausible, tileable, high-resolution physically-based digital materials. We carefully analyze the behavior of diffusion models for multi-channel outputs, adapt the sampling process to fuse multi-scale information and introduce rolled diffusion to enable both tileability and patched diffusion for high-resolution outputs. Our generative approach further permits exploration of a variety of materials which could correspond to the input image, mitigating the unknown lighting conditions. We show that our approach outperforms recent inference and latent-space-optimization methods, and carefully validate our diffusion process design choices. Supplemental materials and additional details are available at: https://gvecchio.com/controlmat/
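
    A plausible reading of the rolled diffusion step (with a placeholder denoiser; offsets, scheduling and conditioning are omitted) is to circularly shift the latent before each denoising step so that no fixed image border exists, which encourages tileable outputs:

    import torch

    def rolled_denoise(latent, denoiser, timesteps):
        for t in timesteps:
            dy = int(torch.randint(0, latent.shape[-2], (1,)))
            dx = int(torch.randint(0, latent.shape[-1], (1,)))
            latent = torch.roll(latent, shifts=(dy, dx), dims=(-2, -1))    # wrap around
            latent = denoiser(latent, t)           # one reverse-diffusion step
            latent = torch.roll(latent, shifts=(-dy, -dx), dims=(-2, -1))  # undo shift
        return latent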

    PhotoMat: A Material Generator Learned from Single Flash Photos

    Authoring high-quality digital materials is key to realism in 3D rendering. Previous generative models for materials have been trained exclusively on synthetic data; such data is limited in availability and has a visual gap to real materials. We circumvent this limitation by proposing PhotoMat: the first material generator trained exclusively on real photos of material samples captured using a cell phone camera with flash. Supervision on individual material maps is not available in this setting. Instead, we train a generator for a neural material representation that is rendered with a learned relighting module to create arbitrarily lit RGB images; these are compared against real photos using a discriminator. We then train a material maps estimator to decode material reflectance properties from the neural material representation. We train PhotoMat with a new dataset of 12,000 material photos captured with handheld phone cameras under flash lighting. We demonstrate that our generated materials have better visual quality than previous material generators trained on synthetic data. Moreover, we can fit analytical material models to closely match these generated neural materials, thus allowing for further editing and use in 3D rendering.
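
    The described training setup can be sketched as a standard non-saturating GAN step in which only relit renderings, never material maps, are supervised (generator, relighter and discriminator are placeholder modules; g_opt is assumed to hold the generator and relighter parameters):

    import torch
    import torch.nn.functional as F

    def gan_step(generator, relighter, discriminator, real_photos, d_opt, g_opt):
        z = torch.randn(real_photos.shape[0], 512)       # latent code
        light = torch.rand(real_photos.shape[0], 3)      # sampled flash position
        fake = relighter(generator(z), light)            # arbitrarily lit RGB image

        d_loss = F.softplus(discriminator(fake.detach())).mean() \
               + F.softplus(-discriminator(real_photos)).mean()
        d_opt.zero_grad(); d_loss.backward(); d_opt.step()

        g_loss = F.softplus(-discriminator(fake)).mean() # fool the discriminator
        g_opt.zero_grad(); g_loss.backward(); g_opt.step()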

    Single-Image SVBRDF Capture with a Rendering-Aware Deep Network

    Texture, highlights, and shading are some of the many visual cues that allow humans to perceive material appearance in single pictures. Yet, recovering spatially-varying bi-directional reflectance distribution functions (SVBRDFs) from a single image based on such cues has challenged researchers in computer graphics for decades. We tackle lightweight appearance capture by training a deep neural network to automatically extract and make sense of these visual cues. Once trained, our network is capable of recovering per-pixel normal, diffuse albedo, specular albedo and specular roughness from a single picture of a flat surface lit by a hand-held flash. We achieve this goal by introducing several innovations on training data acquisition and network design. For training, we leverage a large dataset of artist-created, procedural SVBRDFs which we sample and render under multiple lighting directions. We further amplify the data by material mixing to cover a wide diversity of shading effects, which allows our network to work across many material classes. Motivated by the observation that distant regions of a material sample often offer complementary visual cues, we design a network that combines an encoder-decoder convolutional track for local feature extraction with a fully-connected track for global feature extraction and propagation. Many important material effects are view-dependent, and as such ambiguous when observed in a single image. We tackle this challenge by defining the loss as a differentiable SVBRDF similarity metric that compares the renderings of the predicted maps against renderings of the ground truth from several lighting and viewing directions. Combined together, these novel ingredients bring clear improvement over state-of-the-art methods for single-shot capture of spatially varying BRDFs.
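
    The rendering-aware loss can be sketched as follows (a much-simplified point-light shading model with the flash assumed colinear with the view, not the paper's renderer): predicted and ground-truth maps are rendered under several random lights and compared in image space:

    import torch
    import torch.nn.functional as F

    def render(normal, diffuse, specular, roughness, light):
        # maps: (B, C, H, W); light: (B, 3, 1, 1) unit direction, view == light
        n_dot_l = (normal * light).sum(1, keepdim=True).clamp(min=0.0)
        n_dot_h = n_dot_l.clamp(min=1e-4)              # half vector equals light here
        alpha2 = (roughness ** 2).clamp(min=1e-4) ** 2
        d = alpha2 / (torch.pi * ((n_dot_h ** 2) * (alpha2 - 1.0) + 1.0) ** 2)  # GGX lobe
        return (diffuse / torch.pi + specular * d) * n_dot_l

    def rendering_loss(pred_maps, gt_maps, n_lights=3):
        loss = 0.0
        for _ in range(n_lights):
            l = F.normalize(torch.randn(pred_maps[0].shape[0], 3, 1, 1), dim=1)
            l[:, 2] = l[:, 2].abs()                    # keep the light above the surface
            loss = loss + F.l1_loss(render(*pred_maps, l), render(*gt_maps, l))
        return loss / n_lights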

    Lightweight material acquisition using deep learning

    Whether it is used for entertainment or industrial design, computer graphics is ever more present in our everyday life. Yet, reproducing a real scene appearance in a virtual environment remains a challenging task, requiring long hours from trained artists. A good solution is the acquisition of geometries and materials directly from real-world examples, but this often comes at the cost of complex hardware and calibration processes. In this thesis, we focus on lightweight material appearance capture to simplify and accelerate the acquisition process and solve industrial challenges such as result image resolution or calibration. Texture, highlights, and shading are some of the many visual cues that allow humans to perceive material appearance in pictures. Designing algorithms able to leverage these cues to recover spatially-varying bi-directional reflectance distribution functions (SVBRDFs) from a few images has challenged computer graphics researchers for decades. We explore the use of deep learning to tackle lightweight appearance capture and make sense of these visual cues. Once trained, our networks are capable of recovering per-pixel normals, diffuse albedo, specular albedo and specular roughness from as little as one picture of a flat surface lit by the environment or a hand-held flash. We show how our method improves its prediction with the number of input pictures to reach high-quality reconstructions with up to 10 images --- a sweet spot between existing single-image and complex multi-image approaches --- and allows capturing large-scale, HD materials. We achieve this goal by introducing several innovations on training data acquisition and network design, bringing clear improvement over the state of the art for lightweight material capture.
