2 research outputs found
Automated inverse-rendering techniques for realistic 3D artefact compositing in 2D photographs
PhD ThesisThe process of acquiring images of a scene and modifying the defining structural features
of the scene through the insertion of artefacts is known in literature as compositing. The
process can take effect in the 2D domain (where the artefact originates from a 2D image
and is inserted into a 2D image), or in the 3D domain (the artefact is defined as a dense
3D triangulated mesh, with textures describing its material properties).
Compositing originated as a solution to enhancing, repairing, and more broadly editing
photographs and video data alike in the film industry as part of the post-production stage.
This is generally thought of as carrying out operations in a 2D domain (a single image
with a known width, height, and colour data). The operations involved are sequential and
entail separating the foreground from the background (matting), or identifying features
from contour (feature matching and segmentation) with the purpose of introducing new
data in the original. Since then, compositing techniques have gained more traction in the
emerging fields of Mixed Reality (MR), Augmented Reality (AR), robotics and machine
vision (scene understanding, scene reconstruction, autonomous navigation). When focusing
on the 3D domain, compositing can be translated into a pipeline 1 - the incipient stage
acquires the scene data, which then undergoes a number of processing steps aimed at
inferring structural properties that ultimately allow for the placement of 3D artefacts
anywhere within the scene, rendering a plausible and consistent result with regard to the
physical properties of the initial input.
This generic approach becomes challenging in the absence of user annotation and
labelling of scene geometry, light sources and their respective magnitude and orientation,
as well as a clear object segmentation and knowledge of surface properties. A single image,
a stereo pair, or even a short image stream may not hold enough information regarding
the shape or illumination of the scene, however, increasing the input data will only incur
an extensive time penalty which is an established challenge in the field.
Recent state-of-the-art methods address the difficulty of inference in the absence of
1In the present document, the term pipeline refers to a software solution formed of stand-alone modules
or stages. It implies that the flow of execution runs in a single direction, and that each module has the
potential to be used on its own as part of other solutions. Moreover, each module is assumed to take an
input set and output data for the following stage, where each module addresses a single type of problem
only.
data, nonetheless, they do not attempt to solve the challenge of compositing artefacts
between existing scene geometry, or cater for the inclusion of new geometry behind complex
surface materials such as translucent glass or in front of reflective surfaces.
The present work focuses on the compositing in the 3D domain and brings forth a
software framework 2 that contributes solutions to a number of challenges encountered in
the field, including the ability to render physically-accurate soft shadows in the absence
of user annotate scene properties or RGB-D data. Another contribution consists in the
timely manner in which the framework achieves a believable result compared to the other
compositing methods which rely on offline rendering. The availability of proprietary
hardware and user expertise are two of the main factors that are not required in order to
achieve a fast and reliable results within the current framework