59 research outputs found
Capturing and Reconstructing the Appearance of Complex 3D Scenes
In this thesis, we present our research on new acquisition methods for reflectance properties of real-world objects. Specifically, we first show a method for acquiring spatially varying densities in volumes of translucent, gaseous material with just a single image. This makes the method applicable to constantly changing phenomena like smoke without the use of high-speed camera equipment. Furthermore, we investigated how two well-known techniques, synthetic aperture confocal imaging and algorithmic descattering, can be combined to help see through a translucent medium like fog or murky water. We show that the depth at which we can still see an object embedded in the scattering medium is increased. In a related publication, we show how polarization and descattering based on phase-shifting can be combined for efficient 3D scanning of translucent objects. Normally, subsurface scattering hinders the range estimation by offsetting the peak intensity beneath the surface away from the point of incidence. With our method, the subsurface scattering is reduced to a minimum and therefore reliable 3D scanning is made possible. Finally, we present a system which recovers surface geometry, reflectance properties of opaque objects, and prevailing lighting conditions at the time of image capture from just a small number of input photographs. While there exist previous approaches to recover reflectance properties, our system is the first to work on images taken under almost arbitrary, changing lighting conditions. This enables us to use images we took from a community photo collection website
Learning Object-Centric Neural Scattering Functions for Free-viewpoint Relighting and Scene Composition
Photorealistic object appearance modeling from 2D images is a constant topic
in vision and graphics. While neural implicit methods (such as Neural Radiance
Fields) have shown high-fidelity view synthesis results, they cannot relight
the captured objects. More recent neural inverse rendering approaches have
enabled object relighting, but they represent surface properties as simple
BRDFs, and therefore cannot handle translucent objects. We propose
Object-Centric Neural Scattering Functions (OSFs) for learning to reconstruct
object appearance from only images. OSFs not only support free-viewpoint object
relighting, but also can model both opaque and translucent objects. While
accurately modeling subsurface light transport for translucent objects can be
highly complex and even intractable for neural methods, OSFs learn to
approximate the radiance transfer from a distant light to an outgoing direction
at any spatial location. This approximation avoids explicitly modeling complex
subsurface scattering, making learning a neural implicit model tractable.
Experiments on real and synthetic data show that OSFs accurately reconstruct
appearances for both opaque and translucent objects, allowing faithful
free-viewpoint relighting as well as scene composition. Project website:
https://kovenyu.com/osf/
Comment: Journal extension of arXiv:2012.08503. The first two authors contributed equally to this work
Neural Relightable Participating Media Rendering
Learning a neural radiance field of a scene has recently enabled realistic
novel view synthesis, but such fields can only synthesize images
under the original fixed lighting condition. Therefore, they are not flexible
enough for much-desired tasks like relighting, scene editing and scene
composition. To tackle this problem, several recent methods propose to
disentangle reflectance and illumination from the radiance field. These methods
can cope with solid objects with opaque surfaces but participating media are
neglected. Also, they take into account only direct illumination or at most
one-bounce indirect illumination, and thus suffer from energy loss because
they ignore higher-order indirect illumination. We propose to learn neural
representations for participating media with a complete simulation of global
illumination. We estimate direct illumination via ray tracing and compute
indirect illumination with spherical harmonics. Our approach avoids computing
the lengthy indirect bounces and does not suffer from energy loss. Our
experiments on multiple scenes show that our approach achieves superior visual
quality and numerical performance compared to state-of-the-art methods, and it
can generalize to deal with solid objects with opaque surfaces as well.
Comment: Accepted to NeurIPS 202
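The abstract above mentions computing indirect illumination with spherical harmonics. As a rough illustration only, and not the authors' implementation, the following sketch evaluates a radiance function stored as real spherical harmonic coefficients for bands 0 and 1. The function names, the coefficient values, and the sign convention (no Condon-Shortley phase) are assumptions made for this example.

```python
import math


def sh_basis_l1(direction):
    """Real spherical harmonic basis for bands 0 and 1 at a 3D direction.

    Assumed ordering: (l=0), (l=1, m=-1), (l=1, m=0), (l=1, m=1),
    without the Condon-Shortley phase.
    """
    x, y, z = direction
    norm = math.sqrt(x * x + y * y + z * z)
    x, y, z = x / norm, y / norm, z / norm
    c0 = 0.5 * math.sqrt(1.0 / math.pi)       # constant band-0 term
    c1 = math.sqrt(3.0 / (4.0 * math.pi))     # scale of the linear band-1 terms
    return [c0, c1 * y, c1 * z, c1 * x]


def eval_sh(coeffs, direction):
    """Reconstruct the encoded value in `direction` as a weighted basis sum."""
    return sum(c * b for c, b in zip(coeffs, sh_basis_l1(direction)))


# Hypothetical coefficients: an ambient term plus a lobe brighter towards +z.
coeffs = [1.0, 0.0, 0.5, 0.0]
up = eval_sh(coeffs, (0.0, 0.0, 1.0))
down = eval_sh(coeffs, (0.0, 0.0, -1.0))
```

Because the band-1 terms are odd in the direction, `up` and `down` differ only in the sign of the z contribution, so their average equals the ambient term alone.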
Automated inverse-rendering techniques for realistic 3D artefact compositing in 2D photographs
PhD Thesis
The process of acquiring images of a scene and modifying the defining structural features
of the scene through the insertion of artefacts is known in the literature as compositing. The
process can take effect in the 2D domain (where the artefact originates from a 2D image
and is inserted into a 2D image), or in the 3D domain (the artefact is defined as a dense
3D triangulated mesh, with textures describing its material properties).
Compositing originated as a solution to enhancing, repairing, and more broadly editing
photographs and video data alike in the film industry as part of the post-production stage.
This is generally thought of as carrying out operations in a 2D domain (a single image
with a known width, height, and colour data). The operations involved are sequential and
entail separating the foreground from the background (matting), or identifying features
from contour (feature matching and segmentation) with the purpose of introducing new
data in the original. Since then, compositing techniques have gained more traction in the
emerging fields of Mixed Reality (MR), Augmented Reality (AR), robotics and machine
vision (scene understanding, scene reconstruction, autonomous navigation). When focusing
on the 3D domain, compositing can be translated into a pipeline [1]: the initial stage
acquires the scene data, which then undergoes a number of processing steps aimed at
inferring structural properties that ultimately allow for the placement of 3D artefacts
anywhere within the scene, rendering a plausible and consistent result with regard to the
physical properties of the initial input.
This generic approach becomes challenging in the absence of user annotation and
labelling of scene geometry, light sources and their respective magnitude and orientation,
as well as a clear object segmentation and knowledge of surface properties. A single image,
a stereo pair, or even a short image stream may not hold enough information regarding
the shape or illumination of the scene. However, increasing the input data incurs
an extensive time penalty, which is an established challenge in the field.
Recent state-of-the-art methods address the difficulty of inference in the absence of
data. Nonetheless, they do not attempt to solve the challenge of compositing artefacts
between existing scene geometry, or cater for the inclusion of new geometry behind complex
surface materials such as translucent glass or in front of reflective surfaces.
[1] In the present document, the term pipeline refers to a software solution formed of stand-alone modules
or stages. It implies that the flow of execution runs in a single direction, and that each module has the
potential to be used on its own as part of other solutions. Moreover, each module is assumed to take an
input set and output data for the following stage, where each module addresses a single type of problem
only.
The present work focuses on compositing in the 3D domain and brings forth a
software framework that contributes solutions to a number of challenges encountered in
the field, including the ability to render physically-accurate soft shadows in the absence
of user-annotated scene properties or RGB-D data. Another contribution consists in the
timely manner in which the framework achieves a believable result compared to the other
compositing methods which rely on offline rendering. Neither proprietary
hardware nor user expertise is required in order to
achieve fast and reliable results within the current framework
Learning Visual Appearance: Perception, Modeling and Editing.
Visual appearance determines how we understand an object or image and is therefore a fundamental aspect of digital content creation. It is a general term that encompasses others such as material appearance, defined as the impression we have of a material, which involves a physical interaction between light and matter and the way our visual system perceives it. However, computationally modeling the behavior of our visual system is a difficult task, among other reasons because there is no definitive, unified theory of human visual perception. Moreover, although we have developed algorithms capable of faithfully modeling the interaction between light and matter, there is a disconnect between the physical parameters these algorithms use and the perceptual parameters the human visual system understands. This makes manipulating these physical representations and their interactions a tedious and costly task, even for expert users. This thesis seeks to improve our understanding of the perception of material appearance and to use that knowledge to improve existing algorithms for visual content generation. Specifically, the thesis makes contributions in three areas: proposing new computational models for measuring appearance similarity; investigating the interaction between illumination and geometry; and developing intuitive applications for appearance manipulation, in particular for relighting humans and for editing material appearance.
The first part of the thesis explores methods for measuring appearance similarity. Being able to measure how similar two materials or images are is a classic problem in visual computing fields such as computer vision and computer graphics. We first address the problem of similarity in material appearance. We propose a deep-learning-based method that combines images with subjective judgments of material similarity collected through user studies. We also explore the problem of similarity between icons. In this second case, we use Siamese neural networks, and the style and identity imparted by artists play a key role in the similarity measure.
The second part advances the understanding of how confounding factors affect our perception of material appearance. Two key confounding factors are object geometry and scene illumination. We begin by investigating the effect of these factors on material recognition through several experiments and statistical studies. We also investigate the effect of object motion on the perception of material appearance.
In the third part we explore intuitive applications for manipulating visual appearance. First, we address the problem of relighting humans. We propose a new formulation of the problem and, based on it, design and train a deep neural network model to relight a scene. Finally, we address the problem of intuitive material editing. To this end, we collect human judgments on the perception of different attributes and present a model, based on deep neural networks, that can edit materials realistically simply by varying the values of the collected attributes.
BSSRDF estimation from single images
We present a novel method to estimate an approximation of the reflectance characteristics of optically thick, homogeneous translucent materials using only a single photograph as input. First, we approximate the diffusion profile as a linear combination of piecewise constant functions, an approach that enables a linear system minimization and maximizes robustness in the presence of suboptimal input data inferred from the image. We then fit to a smoother monotonically decreasing model, ensuring continuity of its first derivative. We show the feasibility of our approach and validate it in controlled environments, comparing well against physical measurements from previous works. Next, we explore the performance of our method in uncontrolled scenarios, where neither lighting nor geometry is known. We show that these can be roughly approximated from the corresponding image by making two simple assumptions: that the object is lit by a distant light source and that it is globally convex, allowing us to capture the visual appearance of the photographed material. Compared with previous works, our technique offers an attractive balance between visual accuracy and ease of use, allowing its use in a wide range of scenarios including off-the-shelf, single images, thus extending the current repertoire of real-world data acquisition techniques
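The first step described above, writing the diffusion profile as a linear combination of piecewise constant functions so that fitting becomes a linear least-squares problem, can be sketched as follows. This is a toy illustration under strong assumptions, not the paper's method: the surface is a 1D row of equally spaced samples, the irradiance at each sample is known, and all names (`build_system`, `solve_least_squares`, the bin width, the synthetic profile) are hypothetical.

```python
import random


def build_system(positions, irradiance, bin_width, n_bins):
    """Row i, column k: total irradiance at samples whose distance to sample i
    falls in bin k. With a piecewise constant profile w, radiance = A @ w."""
    rows = []
    for xi in positions:
        row = [0.0] * n_bins
        for xj, ej in zip(positions, irradiance):
            k = int(abs(xi - xj) / bin_width)
            if k < n_bins:  # contributions beyond the last bin are ignored
                row[k] += ej
        rows.append(row)
    return rows


def solve_least_squares(a, b):
    """Minimize ||a w - b||^2 via the normal equations and Gaussian elimination."""
    n = len(a[0])
    ata = [[sum(r[p] * r[q] for r in a) for q in range(n)] for p in range(n)]
    atb = [sum(r[p] * bi for r, bi in zip(a, b)) for p in range(n)]
    m = [ata[p] + [atb[p]] for p in range(n)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    w = [0.0] * n
    for p in reversed(range(n)):  # back substitution
        w[p] = (m[p][n] - sum(m[p][q] * w[q] for q in range(p + 1, n))) / m[p][p]
    return w


# Synthetic data: a decreasing profile and seeded pseudo-random irradiance.
rng = random.Random(0)
positions = [float(i) for i in range(30)]
irradiance = [rng.uniform(0.5, 1.5) for _ in positions]
true_profile = [1.0, 0.5, 0.2, 0.05]

system = build_system(positions, irradiance, bin_width=1.0, n_bins=4)
radiance = [sum(a * w for a, w in zip(row, true_profile)) for row in system]
estimated_profile = solve_least_squares(system, radiance)
```

With noise-free synthetic radiance, `estimated_profile` should match `true_profile` up to floating-point error. The subsequent fit to a smooth, monotonically decreasing model mentioned in the abstract is not covered here.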
State of the Art on Neural Rendering
Efficient rendering of photo-realistic virtual worlds is a long-standing effort of computer graphics. Modern graphics techniques have succeeded in synthesizing photo-realistic images from hand-crafted scene representations. However, the automatic generation of shape, materials, lighting, and other aspects of scenes remains a challenging problem that, if solved, would make photo-realistic computer graphics more widely accessible. Concurrently, progress in computer vision and machine learning has given rise to a new approach to image synthesis and editing, namely deep generative models. Neural rendering is a new and rapidly emerging field that combines generative machine learning techniques with physical knowledge from computer graphics, e.g., by the integration of differentiable rendering into network training. With a plethora of applications in computer graphics and vision, neural rendering is poised to become a new area in the graphics community, yet no survey of this emerging field exists. This state-of-the-art report summarizes the recent trends and applications of neural rendering. We focus on approaches that combine classic computer graphics techniques with deep generative models to obtain controllable and photo-realistic outputs. Starting with an overview of the underlying computer graphics and machine learning concepts, we discuss critical aspects of neural rendering approaches. This state-of-the-art report is focused on the many important use cases for the described algorithms such as novel view synthesis, semantic photo manipulation, facial and body reenactment, relighting, free-viewpoint video, and the creation of photo-realistic avatars for virtual and augmented reality telepresence. Finally, we conclude with a discussion of the social implications of such technology and investigate open research problems
An intuitive control space for material appearance
Many different techniques for measuring material appearance have been
proposed in the last few years. These have produced large public datasets,
which have been used for accurate, data-driven appearance modeling. However,
although these datasets have allowed us to reach an unprecedented level of
realism in visual appearance, editing the captured data remains a challenge. In
this paper, we present an intuitive control space for predictable editing of
captured BRDF data, which allows for artistic creation of plausible novel
material appearances, bypassing the difficulty of acquiring novel samples. We
first synthesize novel materials, extending the existing MERL dataset up to 400
mathematically valid BRDFs. We then design a large-scale experiment, gathering
56,000 subjective ratings on the high-level perceptual attributes that best
describe our extended dataset of materials. Using these ratings, we build and
train networks of radial basis functions to act as functionals mapping the
perceptual attributes to an underlying PCA-based representation of BRDFs. We
show that our functionals are excellent predictors of the perceived attributes
of appearance. Our control space enables many applications, including intuitive
material editing of a wide range of visual properties, guidance for gamut
mapping, analysis of the correlation between perceptual attributes, or novel
appearance similarity metrics. Moreover, our methodology can be used to derive
functionals applicable to classic analytic BRDF representations. We release our
code and dataset publicly, in order to support and encourage further research
in this direction
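The abstract above describes radial basis function networks that map perceptual attributes to a PCA-based BRDF representation. A minimal sketch of the underlying idea is exact Gaussian RBF interpolation from attribute vectors to a single PCA coefficient. The attribute values, coefficient values, kernel width, and function names below are all hypothetical, and the paper's trained networks may differ substantially from this simplified form.

```python
import math


def gaussian_kernel(a, b, sigma):
    """Gaussian radial basis function of the distance between points a and b."""
    d2 = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return math.exp(-d2 / (2.0 * sigma ** 2))


def solve_linear(matrix, rhs):
    """Solve a square linear system by Gaussian elimination with pivoting."""
    n = len(rhs)
    m = [row[:] + [v] for row, v in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for p in reversed(range(n)):
        x[p] = (m[p][n] - sum(m[p][q] * x[q] for q in range(p + 1, n))) / m[p][p]
    return x


def fit_rbf(points, values, sigma):
    """One weight per training point so the interpolant passes through the data."""
    phi = [[gaussian_kernel(p, q, sigma) for q in points] for p in points]
    return solve_linear(phi, values)


def predict(x, points, weights, sigma):
    """Evaluate the RBF interpolant at attribute vector x."""
    return sum(w * gaussian_kernel(x, p, sigma) for w, p in zip(weights, points))


# Hypothetical training data: two perceptual attributes per material
# and the first PCA coefficient of that material's BRDF.
attributes = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (0.5, 0.5)]
pca_coefficient = [0.2, -0.4, 0.9, 0.1, 0.3]
sigma = 0.5

weights = fit_rbf(attributes, pca_coefficient, sigma)
reproduced = [predict(a, attributes, weights, sigma) for a in attributes]
edited = predict((0.75, 0.25), attributes, weights, sigma)  # a new attribute setting
```

Here `reproduced` should equal the training coefficients up to rounding, while `edited` shows how moving a point in attribute space yields a new coefficient. A full system would repeat this for every PCA component and reconstruct the BRDF from them.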