Towards Real-time Mixed Reality Matting In Natural Scenes
In Mixed Reality scenarios, background replacement is a common way to immerse a user in a synthetic environment. Properly identifying the background pixels in an image or video is a difficult problem known as matting. Proper alpha mattes usually come from human guidance, special hardware setups, or color-dependent algorithms. This is a consequence of the under-constrained nature of the per-pixel alpha blending equation. In constant color matting, research identifies and replaces a background that is a single color, known as the chroma key color. Unfortunately, the algorithms force a controlled physical environment and favor constant, uniform lighting. More generic approaches, such as natural image matting, have made progress finding alpha matte solutions in environments with naturally occurring backgrounds. However, even for the quicker algorithms, the generation of trimaps, indicating regions of known foreground and background pixels, normally requires human interaction or offline computation. This research addresses ways to automatically solve an alpha matte for an image in real time, and by extension a video, using a consumer-level GPU. It does so even in the context of noisy environments that result in less reliable constraints than found in controlled settings. To attack these challenges, we are particularly interested in automatically generating trimaps from depth buffers for dynamic scenes so that algorithms requiring more dense constraints may be used. The resulting computation is parallelizable so that it may run on a GPU and should work for natural images as well as chroma key backgrounds. Extra input may be required, but when this occurs, commodity hardware available in most Mixed Reality setups should be able to provide the input. This allows us to provide real-time alpha mattes for Mixed Reality scenarios that take place in relatively controlled environments.
As a consequence, while monochromatic backdrops (such as green screens or retro-reflective material) aid the algorithm's accuracy, they are not an explicit requirement. Finally, we explore a sub-image based approach to parallelize an existing hierarchical approach on high resolution imagery. We show that locality can be exploited to significantly reduce the memory and compute requirements previously necessary when computing alpha mattes of high resolution images. We achieve this using a parallelizable scheme that is independent of both the matting algorithm and image features. Combined, these research topics provide a basis for Mixed Reality scenarios using real-time natural image matting on high definition video sources.
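The per-pixel alpha blending equation mentioned in this abstract is commonly written I = alpha * F + (1 - alpha) * B. For an RGB pixel this gives three equations in seven unknowns (alpha plus three channels each of F and B), which is why extra constraints such as a trimap are needed. The following is a minimal Python sketch, not the thesis's implementation, of how a trimap might be derived from a depth buffer by simple thresholding. The function names, the `near` and `far` thresholds, and the convention that a non-positive depth marks an invalid sensor reading are all assumptions made for illustration.

```python
# Trimap labels (hypothetical convention): 1.0 = known foreground,
# 0.0 = known background, 0.5 = unknown (left for the matting solver).
FOREGROUND, BACKGROUND, UNKNOWN = 1.0, 0.0, 0.5


def trimap_from_depth(depth, near, far):
    """Label each pixel of a 2D depth buffer (list of rows of floats).

    Assumed rule: depth <= near is foreground, depth >= far is background,
    and anything in between, or an invalid (non-positive) reading, is unknown.
    """
    trimap = []
    for row in depth:
        labels = []
        for d in row:
            if d <= 0.0:          # assumed marker for a missing or noisy sample
                labels.append(UNKNOWN)
            elif d <= near:
                labels.append(FOREGROUND)
            elif d >= far:
                labels.append(BACKGROUND)
            else:
                labels.append(UNKNOWN)
        trimap.append(labels)
    return trimap


def composite(alpha, foreground, background):
    """Blend one channel of one pixel: I = alpha * F + (1 - alpha) * B."""
    return alpha * foreground + (1.0 - alpha) * background


if __name__ == "__main__":
    depth_buffer = [[0.5, 1.2, 3.0],
                    [0.0, 0.9, 2.5]]
    print(trimap_from_depth(depth_buffer, near=1.0, far=2.0))
    print(composite(0.25, 200.0, 100.0))
```

Each pixel is labelled independently of its neighbours, so the same rule could in principle be evaluated by one GPU thread per pixel, which is the kind of parallelism the abstract relies on.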
Automated inverse-rendering techniques for realistic 3D artefact compositing in 2D photographs
PhD Thesis. The process of acquiring images of a scene and modifying the defining structural features
of the scene through the insertion of artefacts is known in literature as compositing. The
process can take effect in the 2D domain (where the artefact originates from a 2D image
and is inserted into a 2D image), or in the 3D domain (the artefact is defined as a dense
3D triangulated mesh, with textures describing its material properties).
Compositing originated as a solution to enhancing, repairing, and more broadly editing
photographs and video data alike in the film industry as part of the post-production stage.
This is generally thought of as carrying out operations in a 2D domain (a single image
with a known width, height, and colour data). The operations involved are sequential and
entail separating the foreground from the background (matting), or identifying features
from contour (feature matching and segmentation) with the purpose of introducing new
data in the original. Since then, compositing techniques have gained more traction in the
emerging fields of Mixed Reality (MR), Augmented Reality (AR), robotics and machine
vision (scene understanding, scene reconstruction, autonomous navigation). When focusing
on the 3D domain, compositing can be translated into a pipeline¹: the incipient stage
acquires the scene data, which then undergoes a number of processing steps aimed at
inferring structural properties that ultimately allow for the placement of 3D artefacts
anywhere within the scene, rendering a plausible and consistent result with regard to the
physical properties of the initial input.
This generic approach becomes challenging in the absence of user annotation and
labelling of scene geometry, light sources and their respective magnitude and orientation,
as well as a clear object segmentation and knowledge of surface properties. A single image,
a stereo pair, or even a short image stream may not hold enough information regarding
the shape or illumination of the scene; however, increasing the input data incurs
an extensive time penalty, which is an established challenge in the field.
Recent state-of-the-art methods address the difficulty of inference in the absence of
data; nonetheless, they do not attempt to solve the challenge of compositing artefacts
between existing scene geometry, or cater for the inclusion of new geometry behind complex
surface materials such as translucent glass or in front of reflective surfaces.
¹ In the present document, the term pipeline refers to a software solution formed of stand-alone modules
or stages. It implies that the flow of execution runs in a single direction, and that each module has the
potential to be used on its own as part of other solutions. Moreover, each module is assumed to take an
input set and output data for the following stage, where each module addresses a single type of problem
only.
The present work focuses on compositing in the 3D domain and brings forth a
software framework that contributes solutions to a number of challenges encountered in
the field, including the ability to render physically-accurate soft shadows in the absence
of user-annotated scene properties or RGB-D data. Another contribution consists in the
timely manner in which the framework achieves a believable result compared to other
compositing methods, which rely on offline rendering. Neither proprietary
hardware nor user expertise is required in order to
achieve fast and reliable results within the current framework.
Efficient image-based rendering
Recent advancements in real-time ray tracing and deep learning have significantly enhanced the realism of computer-generated images. However, conventional 3D computer graphics (CG) can still be time-consuming and resource-intensive, particularly when creating photo-realistic simulations of complex or animated scenes. Image-based rendering (IBR) has emerged as an alternative approach that utilizes pre-captured images from the real world to generate realistic images in real time, eliminating the need for extensive modeling. Although IBR has its advantages, it faces challenges in providing the same level of control over scene attributes as traditional CG pipelines and in accurately reproducing complex scenes and objects with different materials, such as transparent objects. This thesis endeavors to address these issues by harnessing the power of deep learning and incorporating the fundamental principles of graphics and physically based rendering. It offers an efficient solution that enables interactive manipulation of real-world dynamic scenes captured from sparse views, lighting positions, and times, as well as a physically based approach that facilitates accurate reproduction of the view-dependent effects resulting from the interaction between transparent objects and their surrounding environment. Additionally, this thesis develops a visibility metric that can identify artifacts in the reconstructed IBR images without observing the reference image, thereby contributing to the design of an effective IBR acquisition pipeline. Lastly, a perception-driven rendering technique is developed to provide high-fidelity visual content in virtual reality displays while retaining computational efficiency.
Computer-assisted animation creation techniques for hair animation and shade, highlight, and shadow
Degree system: New; Report number: Kō No. 3062; Degree type: Doctor of Engineering; Date conferred: 2010/2/25; Waseda University diploma number: New No. 532
Fast Accurate and Automatic Brushstroke Extraction
Brushstrokes are viewed as the artist's "handwriting" in a painting. In many applications such as style learning and transfer, mimicking painting, and painting authentication, it is highly desired to quantitatively and accurately identify brushstroke characteristics from old masters' pieces using computer programs. However, because a painting contains hundreds or thousands of intermingling brushstrokes, this remains challenging. This article proposes an efficient algorithm for brush Stroke extraction based on a Deep neural network, i.e., DStroke. Compared to the state-of-the-art research, the main merit of the proposed DStroke is that it automatically and rapidly extracts brushstrokes from a painting without manual annotation, while accurately approximating the real brushstrokes with high reliability. In particular, it recovers the faithful soft transitions between brushstrokes, which other methods often ignore. In fact, the details of brushstrokes in a masterpiece (e.g., shapes, colors, texture, overlaps) are highly desired by artists since they hold promise to enhance and extend the artists' powers, just like microscopes extend biologists' powers. To demonstrate the high efficiency of the proposed DStroke, we apply it to a set of real scans of paintings and a set of synthetic paintings, respectively. Experiments show that the proposed DStroke is noticeably faster and more accurate at identifying and extracting brushstrokes, outperforming the other methods.
Fehlerkaschierte Bildbasierte Darstellungsverfahren
Creating photo-realistic images has been one of the major goals in computer graphics since its early days. Instead of modeling the complexity of nature with standard modeling tools, image-based approaches aim at exploiting real-world footage directly, as they are photo-realistic by definition. A drawback of these approaches has always been that the composition or combination of different sources is a non-trivial task, often resulting in annoying visible artifacts. In this thesis we focus on different techniques to diminish visible artifacts when combining multiple images in a common image domain. The results are either novel images, when dealing with the composition task of multiple images, or novel video sequences rendered in real time, when dealing with video footage from multiple cameras.
View-dependent precomputed light transport using non-linear Gaussian function approximations
Thesis (S.M.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, February 2006. Includes bibliographical references (p. 43-46). We propose a real-time method for rendering rigid objects with complex view-dependent effects under distant all-frequency lighting. Existing precomputed light transport approaches can render rich global illumination effects, but high-frequency view-dependent effects such as sharp highlights remain a challenge. We introduce a new representation of the light transport operator based on sums of Gaussians. The non-linear parameters of the representation allow for 1) arbitrary bandwidth because scale is encoded as a direct parameter; and 2) high-quality interpolation across view and mesh triangles because we interpolate the average direction of the incoming light, thereby preventing linear cross-fading artifacts. However, fitting the precomputed light transport data to this new representation requires solving a non-linear regression problem that is more involved than traditional linear and non-linear (truncation) approximation techniques. We present a new data fitting method based on optimization that includes energy terms aimed at enforcing good interpolation. We demonstrate that our method achieves high visual quality for a small storage cost and fast rendering time. By Paul Elijah Green. S.M.
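The difference between interpolating Gaussian parameters and linearly cross-fading function values can be illustrated with a deliberately simplified one-dimensional sketch in Python. This is not the thesis's representation, which works with directions of incoming light; the scalar lobe centre `mu`, the width `sigma`, and all function names here are assumptions made only to show the effect.

```python
import math


def gaussian(x, mu, sigma, amplitude=1.0):
    """Evaluate a single 1D Gaussian lobe at x."""
    return amplitude * math.exp(-0.5 * ((x - mu) / sigma) ** 2)


def crossfade(x, lobe_a, lobe_b, t):
    """Linearly blend the *values* of two lobes, each given as (mu, sigma).

    Halfway between two narrow lobes this yields two half-strength
    highlights rather than one highlight in between.
    """
    return (1.0 - t) * gaussian(x, *lobe_a) + t * gaussian(x, *lobe_b)


def interpolate_parameters(x, lobe_a, lobe_b, t):
    """Blend the *parameters* (centre and width), then evaluate one lobe.

    The highlight keeps its full strength and moves smoothly from the
    centre of lobe_a to the centre of lobe_b.
    """
    mu = (1.0 - t) * lobe_a[0] + t * lobe_b[0]
    sigma = (1.0 - t) * lobe_a[1] + t * lobe_b[1]
    return gaussian(x, mu, sigma)


if __name__ == "__main__":
    a, b = (-1.0, 0.1), (1.0, 0.1)   # two sharp lobes, far apart
    for x in (-1.0, 0.0, 1.0):
        print(x, crossfade(x, a, b, 0.5), interpolate_parameters(x, a, b, 0.5))
```

At the midpoint `t = 0.5`, the cross-faded result has almost no energy at `x = 0` and half-strength peaks at both original centres, while the parameter-interpolated lobe has a single full-strength peak at `x = 0`. This is the kind of artifact the abstract describes as linear cross-fading.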
Reconstruction and rendering of time-varying natural phenomena
While computer performance increases and computer generated images get ever more realistic, the need for modeling computer graphics content is becoming stronger. To achieve photo-realism, detailed scenes have to be modeled, often with a significant amount of manual labour. Interdisciplinary research combining the fields of Computer Graphics, Computer Vision and Scientific Computing has led to the development of (semi-)automatic modeling tools freeing the user of labour-intensive modeling tasks. The modeling of animated content is especially challenging. Realistic motion is necessary to convince the audience of computer games, movies with mixed reality content and augmented reality applications. The goal of this thesis is to investigate automated modeling techniques for time-varying natural phenomena such as water flow, fire, smoke and the motion of heated air. The results of the presented methods are animated, three-dimensional computer models of fire, smoke and fluid flows.