Material acquisition using deep learning
Texture, highlights, and shading are some of many visual cues that allow humans to perceive material appearance in pictures. Designing algorithms able to leverage these cues to recover spatially-varying bi-directional reflectance distribution functions (SVBRDFs) from a few images has challenged computer graphics researchers for decades. I explore the use of deep learning to tackle lightweight appearance capture and make sense of these visual cues. Our networks are capable of recovering per-pixel normals, diffuse albedo, specular albedo and specular roughness from as few as one picture of a flat surface lit by a hand-held flash. We propose a method whose predictions improve with the number of input pictures and which reaches high-quality reconstructions with up to 10 images, a sweet spot between existing single-image and complex multi-image approaches. We introduce several innovations in training data acquisition and network design, bringing clear improvement over the state of the art for lightweight material capture.
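To make the four recovered maps concrete, the sketch below shades a single pixel from its normal, diffuse albedo, specular albedo and specular roughness. It is only an illustration: the abstract does not name a reflectance model, so the Cook-Torrance form with a GGX distribution, Schlick Fresnel and a Schlick-Smith geometry term is an assumption, and `shade_pixel` and its parameters are hypothetical names. Values are single-channel and the light has unit intensity.

```python
import math


def _normalize(v):
    length = math.sqrt(sum(c * c for c in v))
    return tuple(c / length for c in v)


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def shade_pixel(normal, diffuse, specular, roughness, light_dir, view_dir):
    """Radiance leaving one pixel under a unit-intensity directional light.

    Assumed model (not stated in the abstract): Lambertian diffuse lobe plus
    a Cook-Torrance specular lobe with GGX distribution.
    """
    n = _normalize(normal)
    l = _normalize(light_dir)
    v = _normalize(view_dir)
    n_l = _dot(n, l)
    n_v = _dot(n, v)
    # Light or viewer below the surface: nothing is reflected.
    if n_l <= 0.0 or n_v <= 0.0:
        return 0.0

    h = _normalize(tuple(a + b for a, b in zip(l, v)))  # half vector
    n_h = max(_dot(n, h), 0.0)
    v_h = max(_dot(v, h), 0.0)

    # Clamp roughness to avoid a division by zero for a perfect mirror.
    alpha = max(roughness, 1e-3) ** 2
    a2 = alpha * alpha

    # GGX normal distribution term.
    d = a2 / (math.pi * (n_h * n_h * (a2 - 1.0) + 1.0) ** 2)
    # Schlick approximation of Fresnel, with specular albedo as F0.
    f = specular + (1.0 - specular) * (1.0 - v_h) ** 5
    # Schlick-Smith geometry (shadowing and masking) term.
    k = alpha / 2.0
    g = (n_l / (n_l * (1.0 - k) + k)) * (n_v / (n_v * (1.0 - k) + k))

    spec = d * f * g / (4.0 * n_l * n_v)
    return (diffuse / math.pi + spec) * n_l
```

With the light and camera both facing the surface head-on, which loosely mimics a hand-held flash next to the lens, a lower roughness value produces a brighter, tighter highlight. That highlight is one of the visual cues the abstract refers to.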
Perspective Plane Program Induction from a Single Image
We study the inverse graphics problem of inferring a holistic representation
for natural images. Given an input image, our goal is to induce a
neuro-symbolic, program-like representation that jointly models camera poses,
object locations, and global scene structures. Such high-level, holistic scene
representations further facilitate low-level image manipulation tasks such as
inpainting. We formulate this problem as jointly finding the camera pose and
scene structure that best describe the input image. The benefits of such joint
inference are two-fold: scene regularity serves as a new cue for perspective
correction, and in turn, accurate perspective correction leads to a simplified
scene structure, similar to how the correct shape leads to the most regular
texture in shape from texture. Our proposed framework, Perspective Plane
Program Induction (P3I), combines search-based and gradient-based algorithms to
efficiently solve the problem. P3I outperforms a set of baselines on a
collection of Internet images, across tasks including camera pose estimation,
global structure inference, and downstream image manipulation tasks.
Comment: CVPR 2020. First two authors contributed equally. Project page: http://p3i.csail.mit.edu
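The general idea of pairing a search-based step with a gradient-based step can be sketched on a toy problem. The example below is not the P3I algorithm or its objective: `energy` is a made-up one-dimensional function with two local minima, standing in for an image-fit cost, and all function names are hypothetical. A coarse search over a few candidates picks the best starting point, and gradient descent then refines it.

```python
def energy(x):
    # Toy cost with local minima near x = -1 and x = +1; the left one is lower.
    return (x * x - 1.0) ** 2 + 0.3 * x


def energy_grad(x):
    # Analytic derivative of the toy cost.
    return 4.0 * x * (x * x - 1.0) + 0.3


def coarse_search(candidates):
    # Search step: evaluate every candidate and keep the cheapest one.
    return min(candidates, key=energy)


def refine(x, learning_rate=0.05, steps=200):
    # Gradient step: plain gradient descent from the chosen starting point.
    for _ in range(steps):
        x -= learning_rate * energy_grad(x)
    return x


def search_then_refine():
    candidates = [-2.0 + 0.5 * i for i in range(9)]  # -2.0, -1.5, ..., 2.0
    start = coarse_search(candidates)
    return refine(start)
```

Gradient descent started from x = 1.0 settles in the shallower right-hand minimum, while the search step first steers the refinement toward the deeper left-hand one. This illustrates why a hybrid scheme can help on costs with several basins.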