Realistic Visualization of Animated Virtual Cloth

Author: Sattler Mirko
Publication venue: Universitäts- und Landesbibliothek Bonn

Photo-realistic rendering of real-world objects is a broad research area with applications in many fields, such as computer-generated films, entertainment and e-commerce. Within photo-realistic rendering, the rendering of cloth is a subarea that involves many important aspects, ranging from material surface reflection properties and macroscopic self-shadowing to animation sequence generation and compression. Besides an introduction to the topic and a broad overview of related work, this thesis describes different methods for handling major aspects of cloth rendering. Material surface reflection properties play an important part in reproducing the look & feel of materials, that is, in making a material identifiable just by looking at it. The BTF (bidirectional texture function), as a function of viewing and illumination direction, is an appropriate representation of reflection properties. It captures effects caused by the mesostructure of a surface, such as roughness, self-shadowing, occlusion, inter-reflections, subsurface scattering and color bleeding. Unfortunately, a BTF data set of a material consists of hundreds to thousands of images, which far exceeds the current memory size of personal computers. This work describes the first usable method to efficiently compress and decompress a BTF data set for rendering at interactive to real-time frame rates. It is based on PCA (principal component analysis) of the BTF data set. While preserving the important visual aspects of the BTF, the achieved compression rates allow several different data sets to be stored in the main memory of consumer hardware while maintaining high rendering quality. Correct handling of complex illumination conditions plays another key role in the realistic appearance of cloth. Therefore, an extension of the BTF compression and rendering algorithm is described that supports distant direct HDR (high-dynamic-range) illumination stored in environment maps.
To further enhance the appearance, macroscopic self-shadowing has to be taken into account. This kind of shadow is absolutely necessary for the visualization of folds and for a life-like 3D impression. This work describes two methods to compute these shadows. The first is seamlessly integrated into the illumination part of the rendering algorithm and is optimized for static meshes. The second handles dynamic objects and uses hardware-accelerated occlusion queries for the visibility determination. Despite its simplicity, the presented algorithm is fast and produces fewer artifacts than other methods. In addition, it incorporates changeable distant direct high-dynamic-range illumination. The human perception system is the main target of any computer graphics application and can also be treated as part of the rendering pipeline. Therefore, the rendering itself can be optimized by analyzing human perception of certain visual aspects in the image. As part of this thesis, an experiment is introduced that evaluates human shadow perception in order to speed up shadow rendering, and optimization approaches are derived from it. Another subarea of cloth visualization in computer graphics is the animation of cloth and avatars for presentations. This work also describes two new methods for the automatic generation and compression of animation sequences. The first method, which generates completely new, customizable animation sequences, is based on finding similarities between animation frames of a given basis sequence. Identifying these similarities allows jumps within the basis sequence to generate endless new sequences. Transmission of animated 3D data over bandwidth-limited channels, such as extended networks, or to less powerful clients requires efficient compression schemes. The second animation method included in this thesis is a geometry data compression scheme.
Similar to the BTF compression, it uses PCA in combination with clustering algorithms to segment similarly moving parts of the animated objects, achieving high compression rates together with very exact reconstruction quality.

Realistic Visualization of Animated Virtual Clothing. Photorealistic rendering of real objects is a broad field of research with applications in many areas, including computer-generated films (CGI), the entertainment industry and e-commerce. Within this field, photorealistic rendering of clothing is an important component. The aspects to be considered range from optical material properties and macroscopic self-shadowing to animation generation and compression. In addition to an introduction to the topic, this thesis gives a broad overview of related work. Its focus is on the important aspects of virtual clothing visualization described above. The optical reflection properties of material surfaces play an important role in characterizing the so-called look & feel of materials, by which a user can identify a material without having to touch it. The BTF (bidirectional texture function) is a function of the viewing and illumination directions and is therefore an appropriate representation of reflection properties. It contains effects such as roughness, self-shadowing, occlusion, inter-reflections, scattering and color bleeding, which are caused by the mesostructure of the surface. Unfortunately, a BTF data set of a material consists of hundreds or thousands of images and thus far exceeds the main memory of conventional computers.
This thesis describes the first practicable method for efficiently compressing and storing BTF data and decompressing it again for visualization in real-time applications. The method is based on principal component analysis (PCA), which orders the data by significance. While PCA preserves the decisive visual aspects of the BTF, it achieves compression rates that allow several BTF materials to be held in the main memory of a consumer PC, which permits high-quality rendering. Correct handling of complex lighting situations plays a further important role in making clothing appear realistic. An extension of the BTF compression and rendering algorithm is therefore also presented that allows the use of high-dynamic-range (HDR) illumination stored in environment maps. To further support the realistic appearance of the clothing, macroscopic self-shadowing must be integrated. This kind of shadow is absolutely necessary for visualizing folds and for a lifelike 3D impression. This thesis therefore also describes two methods for computing these shadows quickly and efficiently. The first is seamlessly integrated into the illumination part of the BTF rendering algorithm above and is optimized for static geometry. The second handles dynamic objects, using hardware-accelerated occlusion queries to perform the visibility computation. This method is simple and easy to implement, yet it is fast and produces fewer artifacts than comparable methods. In addition, it supports changeable, distant HDR illumination. The human perception system is the actual target of any computer graphics application and can therefore itself be regarded as part of an extended rendering pipeline.
Rendering can thus be optimized by analysing human perception of various visual aspects of the computed images. Part of this thesis describes an experiment that investigates human shadow perception in order to speed up shadow rendering. Another subarea of clothing visualization in computer graphics is the animation of clothing and of avatars for presentations. This thesis describes two new methods in this subarea: an algorithm for the automatic generation of new animation sequences, and a compression algorithm for such sequences. The automatic generation of completely new, customizable animations is based on the concept of similarity search. The individual frames of given basis animations are examined for similarities, which may, for example, be the velocities of individual object parts. Identifying these similarities then allows jumps within the basis sequence, which can be used to generate endless new sequences. Transmitting animated 3D data over bandwidth-limited channels such as extended networks or mobile networks, or to so-called thin clients, requires efficient compression. The second method presented in this thesis is a compression scheme for geometry data. As with the compression of BTF data, PCA is used in combination with clustering to analyse the animated geometry and segment it into parts that move similarly. The segments thus identified can then be highly compressed. The algorithm works automatically and also allows very exact reconstruction quality after decompression.
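
The PCA-based compression idea mentioned in this abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: the function names (`pca_compress`, `pca_decompress`, `leading_direction`), the use of power iteration with deflation, and the tiny synthetic "images" are all assumptions chosen so the example runs with the Python standard library alone. Each image is stored as a mean plus `k` weights on `k` shared basis vectors.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def leading_direction(rows, iters=100):
    """Dominant eigenvector of X^T X via power iteration (rows are centered samples)."""
    dim = len(rows[0])
    # Non-uniform start vector (an arbitrary choice); power iteration only
    # fails if the start is exactly orthogonal to the dominant direction.
    v = [float(j + 1) for j in range(dim)]
    norm = math.sqrt(dot(v, v))
    v = [x / norm for x in v]
    for _ in range(iters):
        proj = [dot(r, v) for r in rows]  # X v
        w = [sum(p * r[j] for p, r in zip(proj, rows)) for j in range(dim)]  # X^T (X v)
        norm = math.sqrt(dot(w, w))
        if norm < 1e-12:  # no variance left to explain
            break
        v = [x / norm for x in w]
    return v

def pca_compress(images, k):
    """Return (mean, basis, weights): k basis vectors and k weights per image."""
    n = len(images)
    mean = [sum(col) / n for col in zip(*images)]
    centered = [[x - m for x, m in zip(img, mean)] for img in images]
    residual = [row[:] for row in centered]
    basis = []
    for _ in range(k):
        v = leading_direction(residual)
        basis.append(v)
        # Deflation: remove the component just found before searching for the next.
        deflated = []
        for r in residual:
            c = dot(r, v)
            deflated.append([x - c * vj for x, vj in zip(r, v)])
        residual = deflated
    weights = [[dot(row, b) for b in basis] for row in centered]
    return mean, basis, weights

def pca_decompress(mean, basis, w):
    """Rebuild one image from its weights."""
    out = mean[:]
    for wi, b in zip(w, basis):
        out = [o + wi * bj for o, bj in zip(out, b)]
    return out

# Hypothetical data: five 4-pixel "images" that vary along a single direction.
direction = [2.0, -1.0, 0.5, 3.0]
images = [[10.0 + t * d for d in direction] for t in (0.0, 1.0, 2.0, 3.0, 4.0)]
mean, basis, weights = pca_compress(images, k=1)
restored = [pca_decompress(mean, basis, w) for w in weights]
max_error = max(abs(a - b) for img, rec in zip(images, restored)
                for a, b in zip(img, rec))
```

Because the synthetic images vary along one direction only, a single component reconstructs them up to rounding error; real BTF data would need more components and would trade reconstruction error against storage.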

bonndoc – Der Publikationsserver der Universität Bonn

Calipso: Physics-based Image and Video Editing through CAD Model Proxies

Author: Cotin Stephane; Courtecuisse Hadrien; Haouchine Nazim; Nießner Matthias; Roy Frederick
Publication date: 12/08/2017

We present Calipso, an interactive method for editing images and videos in a physically coherent manner. Our main idea is to realize physics-based manipulations by running a full physics simulation on proxy geometries given by non-rigidly aligned CAD models. Running these simulations allows us to apply new, unseen forces to move or deform selected objects, change physical parameters such as mass or elasticity, or even add entirely new objects that interact with the rest of the underlying scene. In Calipso, the user makes edits directly in 3D; these edits are processed by the simulation and then transferred to the target 2D content using shape-to-image correspondences in a photo-realistic rendering process. To align the CAD models, we introduce an efficient CAD-to-image alignment procedure that jointly optimizes rigid and non-rigid alignment while preserving the high-level structure of the input shape. Moreover, the user can choose to exploit image flow to estimate scene motion, producing coherent physical behavior with ambient dynamics. We demonstrate Calipso's physics-based editing on a wide range of examples producing myriad physical behaviors while preserving geometric and visual consistency. Comment: 11 pages

arXiv.org e-Print Archive

INRIA a CCSD electronic archive server

Prediction for Projection on Time-Varying Surfaces

Author: Gomes Adam Daniel
Publication venue: University of Waterloo
Publication date: 19/09/2016

In spatial augmented reality applications, when video projectors display images on time-varying, non-planar surfaces, rather than on flat, rigid surfaces, undesired image distortion may occur. For applications where realism is of the utmost importance, such as surgical simulations, image distortion can significantly detract from the user experience. To combat this, the time-varying surface can be modelled using a mass-spring model, commonly used for simulating deformable objects in computer graphics. The mass-spring model can be formulated into a nonlinear state space equation that describes the dynamics of discrete points making up the surface of the object. Two simulation techniques are used to verify the model and to determine the best approach for real-time simulations. To project images in real-time onto quickly changing surfaces, an extended Kalman filter (EKF) prediction algorithm is developed to predict the position of the deforming surface, at a specified point in time, T_s seconds, in the future. Using the linearized mass-spring system, the EKF is formulated and tested upon two simulation scenarios. The simulation scenarios include a falling cloth with added process noise, and a cloth perturbed by random viscous forces. Using mean squared error, the results show the EKF predictions and simulation outputs converge within a narrow band. For each scenario, the parameters of the EKF are manually tuned to improve the accuracy of the predictions. Experimental data is collected by measuring the movement of cloth-like materials to verify the effectiveness of the prediction algorithm. Specifically, cloth movement data is captured using infra-red markers and motion capture software. The EKF prediction algorithm is run on the experimental data producing near convergent results between the predictions and the measurements. When the physical surface is changing noticeably and quickly, compared to the projector's drawing rate, additional distortion may occur. 
An inter-frame prediction algorithm is developed to further predict the position of discrete points at their corresponding projection times. This is most useful when the prediction algorithm produces predictions more slowly than the drawing rate of the projector (T_s > 1/fps). When implementing the EKF in real time, there is a trade-off between speed and accuracy. If the number of discrete points is large, the EKF is required to solve a large system of equations. To combat this, nonlinear optimization techniques are used to find parameters that reduce the number of states while maintaining system dynamics. This results in a sparser, more computationally efficient model with similar physical behaviour to the original system. Applications for time-varying surface prediction include surgical simulations, projection for entertainment and advertising, and other spatial augmented reality applications.
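
The predict-ahead idea in this abstract can be sketched on a deliberately small example. The code below is an illustration under stated assumptions, not the thesis's algorithm: it uses a single damped mass on a spring, which is already linear, so the extended Kalman filter reduces to an ordinary Kalman filter. All names (`transition`, `kf_predict`, `kf_update`, `predict_ahead`) and all parameter values are hypothetical. The filter tracks position measurements and then rolls the model forward `horizon` steps, so that T_s = horizon * dt.

```python
def mat_mul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(col) for col in zip(*a)]

def transition(k=50.0, m=1.0, c=0.2, dt=0.01):
    """Discrete model of one damped mass on a spring (semi-implicit Euler)."""
    a = -k / m * dt        # velocity change per unit displacement
    b = 1.0 - c / m * dt   # fraction of velocity kept after damping
    return [[1.0 + a * dt, b * dt],
            [a, b]]

def step(A, x):
    """Advance the state [position, velocity] by one time step."""
    return [A[0][0] * x[0] + A[0][1] * x[1],
            A[1][0] * x[0] + A[1][1] * x[1]]

def kf_predict(x, P, A, Q):
    P = mat_mul(mat_mul(A, P), transpose(A))
    P = [[P[i][j] + Q[i][j] for j in range(2)] for i in range(2)]
    return step(A, x), P

def kf_update(x, P, z, R):
    """Correct the state with a position measurement z (H = [1, 0])."""
    S = P[0][0] + R                      # innovation variance
    K = [P[0][0] / S, P[1][0] / S]       # Kalman gain
    y = z - x[0]                         # innovation
    x = [x[0] + K[0] * y, x[1] + K[1] * y]
    P = [[P[0][0] - K[0] * P[0][0], P[0][1] - K[0] * P[0][1]],
         [P[1][0] - K[1] * P[0][0], P[1][1] - K[1] * P[0][1]]]
    return x, P

def predict_ahead(x, A, steps):
    """Roll the model forward without measurements."""
    for _ in range(steps):
        x = step(A, x)
    return x

# Hypothetical scenario: noise-free position measurements, wrong initial guess.
A = transition()
Q = [[1e-6, 0.0], [0.0, 1e-4]]   # assumed process noise
R = 1e-4                         # assumed measurement noise
truth = [1.0, 0.0]
est, P = [0.0, 0.0], [[1.0, 0.0], [0.0, 1.0]]
for _ in range(300):
    truth = step(A, truth)
    est, P = kf_predict(est, P, A, Q)
    est, P = kf_update(est, P, truth[0], R)

horizon = 10                     # T_s = horizon * dt = 0.1 s
predicted = predict_ahead(est, A, horizon)
actual = predict_ahead(truth, A, horizon)
```

For a cloth modelled by many coupled masses with nonlinear spring forces, the matrix `A` would instead be the Jacobian of the dynamics evaluated at the current estimate, which is the step that makes the filter "extended".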

University of Waterloo's Institutional Repository

Impact of Soft Tissue Heterogeneity on Augmented Reality for Liver Surgery

Author: Berger Marie-Odile; Cotin Stephane; Dequidt Jeremie; Haouchine Nazim; Kerrien Erwan; Peterlik Igor; Sanz Lopez Mario
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/01/2015

This paper presents a method for real-time augmented reality of internal liver structures during minimally invasive hepatic surgery. Vessels and tumors computed from pre-operative CT scans can be overlaid onto the laparoscopic view for surgical guidance. Compared to current methods, our method is able to locate the in-depth positions of the tumors based on partial three-dimensional liver tissue motion using a real-time biomechanical model. This model makes it possible to properly handle the motion of internal structures even in the case of anisotropic or heterogeneous tissues, as is the case for the liver and many other anatomical structures. Experiments conducted on a phantom liver measure the accuracy of the augmentation, while real-time augmentation on an in vivo human liver during real surgery shows the benefits of such an approach for minimally invasive surgery.

HAL - Lille 3

INRIA a CCSD electronic archive server

HAL Descartes

Hal-Diderot

Template based shape processing

Author: Stoll Carsten
Publication date: 06/05/2010

As computers can only represent and process discrete data, information gathered from the real world always has to be sampled. While it is nowadays possible to sample many signals accurately and thus generate high-quality reconstructions (for example of images and audio data), accurately and densely sampling 3D geometry is still a challenge. The signal samples may be corrupted by noise and outliers, and may contain large holes due to occlusions. These issues become even more pronounced when the temporal domain is also considered. Because of this, developing methods for accurate reconstruction of shapes from a sparse set of discrete data is an important aspect of the computer graphics processing pipeline. In this thesis we propose novel approaches to including semantic knowledge in reconstruction processes using template-based shape processing. We formulate shape reconstruction as a deformable template fitting process, in which we try to fit a given template model to the sampled data. This approach allows us to present novel solutions to several fundamental problems in the area of shape reconstruction. We address static problems such as constrained texture mapping and semantically meaningful hole-filling in surface reconstruction from 3D scans, temporal problems such as mesh-based performance capture, and finally dynamic problems such as the estimation of physically based material parameters of animated templates.

Analog signals must be digitized so that they can be stored and processed on modern computers. For many signals, such as images or audio data, effective and efficient digitization techniques exist today, and the original signals can be reconstructed with sufficient accuracy from the resulting data. In contrast, precise and efficient digitization and reconstruction of 3D or even 4D geometry is still a challenge. Occlusions and errors during digitization lead to holes and noisy measurement data. Research into accurate reconstruction methods for such coarse digital data is therefore a decisive step in the development of modern processing methods in computer graphics. This dissertation demonstrates how deformable geometric models can be used as templates to bring semantic information into the robust reconstruction of 3D and 4D geometry. This makes it possible to develop new approaches to several fundamental problems in computer graphics. With this technique, holes in digitized 3D models can be filled in a semantically meaningful way, and detailed virtual copies of performers and their dynamic clothing can be created.


Color-aware surface registration

Author: Lai Yukun; Martin Ralph Robert; Shiyao Jin; Shuai Lin; Zhiquan Cheng
Publication venue: Elsevier BV
Publication date: 01/08/2016

Shape registration is fundamental to 3D object acquisition; it is used to fuse scans from multiple views. Existing algorithms mainly use geometric information to determine alignment, but this typically results in noticeable misalignment of textures (i.e. surface colors) when using RGB-depth cameras. We address this problem using a novel approach to color-aware registration, which takes both color and geometry into consideration simultaneously. Color information is exploited throughout the pipeline to provide more effective sampling, correspondence and alignment, in particular for surfaces with detailed textures. Our method can furthermore tackle both rigid and non-rigid registration problems (arising, for example, from small changes in the object during scanning, or from camera distortions). We demonstrate that our approach produces significantly better results than previous methods.

Online Research @ Cardiff

Automatic tailoring and cloth modelling for animation characters.

Author: Li Wenxi
Publication date: 01/06/2014

The construction of realistic characters has become increasingly important to the production of blockbuster films, TV series and computer games. A character's outfit plays an important role in the application of virtual characters; it is one of the key elements that reflect the character's personality. Virtual clothing refers to the process of constructing outfits for virtual characters, and it is currently used mainly in two areas: the fashion industry and computer animation. In the fashion industry, virtual clothing technology is an effective tool for creating, editing and pre-visualising cloth design patterns efficiently. However, using this method requires a great deal of tailoring expertise. In computer animation, geometric modelling methods are widely used for cloth modelling because of their simplicity and intuitiveness. However, because animation artists often lack tailoring knowledge, existing cloth design patterns cannot be used directly by them, and the appearance of the cloth depends heavily on the artist's skill. Moreover, geometric modelling methods require many manual operations. This tedium is made worse when the same style of cloth has to be modelled for different characters with different body shapes and proportions. This thesis addresses this problem and presents a new virtual clothing method that includes automatic character measuring, automatic cloth pattern adjustment and cloth pattern assembly. There are two main contributions in this research. Firstly, a geodesic computation scheme based on geodesic curvature flow is presented for acquiring length measurements from a character. Because of the fast-growing demand for high-resolution character models in animation production, the increasing number of characters that need to be handled simultaneously, and the need to improve the reusability of 3D models in film production, the efficiency of modelling cloth for multiple high-resolution characters is very important.
In order to improve the efficiency of measuring characters for cloth fitting, a fast geodesic algorithm that has linear time complexity with a small bounded error is also presented. Secondly, a genetic algorithm for cloth pattern adjustment is developed for automatic cloth fitting and retargeting. Because body shapes and proportions vary widely in character design, fitting and transferring cloth to a different character is a challenging task. This thesis treats the cloth fitting process as an optimization procedure. It optimizes both the shape and the size of each cloth pattern automatically; the integrity, design and size of each cloth pattern are evaluated in order to create 3D cloth for any character with different body shapes and proportions while preserving the original cloth design. By automating the cloth modelling process, the method supports the creativity of animation artists and improves their productivity by allowing them to use the large number of existing cloth design patterns from the fashion industry to create various clothes and to transfer a cloth design to characters with different body shapes and proportions with ease.

Bournemouth University Research Online

Virtuaalse proovikabiini 3D kehakujude ja roboti juhtimisalgoritmide uurimine

Author: Daneshmand Morteza
Publication date: 24/04/2018

The electronic version of the dissertation does not include the publications.

Virtual clothes fitting is one of the key services whose provision can increase the success of clothing stores, because this solution reduces the need for physical work in the fitting phase and makes trying on clothes more convenient for the user. However, most previously proposed computer vision and graphics methods have not succeeded in realistically modelling the human body, especially in 3D modelling of the whole body, which requires a large amount of data and considerable computational resources. Earlier attempts have failed mainly because simultaneous changes in the body surface could not be properly taken into account. In addition, earlier methods have mostly been unable to visualize movements realistically in real time. This project aims to eliminate the aforementioned shortcomings so as to meet the needs of a virtual fitting room. The proposed method consists of scanning, analysing and modelling both the user's body and the garments, computing measurements, placing landmarks, segmenting 3D visual data taken from mannequins, and placing and visualizing the garment model on the user's body. During the project, visual data were collected using a 3D laser scanner and the Kinect optical camera and compiled into a database. These data were used to test the developed algorithms, which mainly deal with the realistic visual representation of garments on the human body and with improving the size-recommendation system in the context of the virtual fitting room.

Virtual fitting constitutes a fundamental element of the developments expected to raise the commercial prosperity of online garment retailers to a new level, as it is expected to reduce the manual labor and physical effort required. Nevertheless, most of the previously proposed computer vision and graphics methods have failed to model the human body accurately and realistically, especially when it comes to 3D modeling of the whole human body. The failure is largely related to the large amount of data and computation required, which in reality is caused mainly by the inability to properly account for simultaneous variations in the body surface. In addition, most of the foregoing techniques cannot render realistic movement representations in real time. This project intends to overcome the aforementioned shortcomings so as to satisfy the requirements of a virtual fitting room. The proposed methodology consists of scanning and analysing both the user's body and the prospective garment to be virtually fitted, modeling them, extracting measurements and assigning reference points on them, and segmenting the 3D visual data imported from the mannequins. Finally, the resulting garment model is superimposed on, adapted to and depicted on the user's body. The project is intended to gather sufficient amounts of visual data using a 3D laser scanner and the Kinect optical camera, and to manage it in the form of a usable database, in order to experimentally implement the algorithms devised. The latter will provide a realistic visual representation of the garment on the body, and enhance the size-advisor system in the context of the virtual fitting room under study.

DSpace at Tartu University Library

Real-Time Implementation of Time-Varying Surface Prediction and Projection

Author: Fernandes Keegan Aaron
Publication venue: University of Waterloo
Publication date: 22/04/2019

Spatial augmented reality makes use of projectors to transform an object into a display surface. However, for time-varying, non-rigid surfaces this can prove difficult and often leads to image distortion. To avoid this, highly accurate measurements of the surface are required. Traditional methods of measuring surface deformations are inadequate because of noise and potential sources of time delay, such as projector lag. To get more accurate results, a mass-spring model can be used to simulate the dynamics of the time-varying surface. This model can be put into a nonlinear state-space form to obtain a first-order differential equation, which can then be solved with numerical integration techniques. To reduce uncertainty in the resulting model, a filtering algorithm can be used. Both the extended Kalman filter (EKF) and the cubature Kalman filter (CKF) are evaluated as potential candidates. To be able to run these filters in real time, a reduced-order model is developed. This enables the use of fewer mass nodes in the model, allowing for faster compute times. Additionally, to reduce visual error, an optimal node placement algorithm is used. This ensures that the surface generated by the mass-spring mesh closely matches the real, curved surface of the system, minimizing error. The EKF and CKF algorithms are applied to a hanging cloth system perturbed by an oscillating fan. A parameter identification technique is used to create a model that accurately represents this hanging cloth system. Additionally, the noise parameters of the EKF and CKF are adjusted to compensate for modeling errors and sensor noise. Finally, the mean squared errors of the EKF and CKF algorithms are compared to evaluate their effectiveness. Both algorithms provide satisfactory results for use in spatial augmented reality applications. However, in all cases tested the CKF is shown to have significantly lower error values.
Although the CKF algorithm is shown to be more accurate than its EKF counterpart, its computation time is much larger. However, the computation time required is still within the threshold for real-time estimation at up to 100 Hz. Furthermore, because of the way the CKF is constructed, it can be run as a multi-threaded workload to significantly reduce computation time. Therefore, a CKF implementation can be used to accurately estimate the positions of a measured surface for use in spatial augmented reality.
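
The modelling step this abstract describes (a mass-spring model written as a first-order state-space system and solved by numerical integration) can be sketched as follows. This is an illustrative toy, not the thesis's model: it assumes a 2D chain of nodes with the first node pinned, and the function names (`derivative`, `rk4_step`), the parameter values and the choice of classical fourth-order Runge-Kutta integration are all assumptions. The filtering stage (EKF or CKF) is not shown.

```python
import math

def derivative(state, n, mass=0.1, k=40.0, damping=0.05, rest=0.1, g=9.81):
    """d(state)/dt for a chain of n nodes; state = [x0, y0, ..., vx0, vy0, ...].

    Node 0 is pinned. The spring force depends on the distance between
    neighbouring nodes, which makes the state-space model nonlinear.
    """
    pos = [(state[2 * i], state[2 * i + 1]) for i in range(n)]
    vel = [(state[2 * n + 2 * i], state[2 * n + 2 * i + 1]) for i in range(n)]
    # Start with viscous damping and gravity on every node.
    force = [[-damping * vx, -damping * vy - mass * g] for vx, vy in vel]
    for i in range(n - 1):
        dx = pos[i + 1][0] - pos[i][0]
        dy = pos[i + 1][1] - pos[i][1]
        length = math.hypot(dx, dy)
        if length == 0.0:
            continue
        f = k * (length - rest)          # Hooke's law, positive when stretched
        fx, fy = f * dx / length, f * dy / length
        force[i][0] += fx
        force[i][1] += fy
        force[i + 1][0] -= fx
        force[i + 1][1] -= fy
    out = []
    for i in range(n):                   # position derivatives are velocities
        out += [0.0, 0.0] if i == 0 else [vel[i][0], vel[i][1]]
    for i in range(n):                   # velocity derivatives are accelerations
        out += [0.0, 0.0] if i == 0 else [force[i][0] / mass, force[i][1] / mass]
    return out

def rk4_step(state, dt, f):
    """One classical Runge-Kutta step for state' = f(state)."""
    def shifted(s, d, h):
        return [a + h * b for a, b in zip(s, d)]
    k1 = f(state)
    k2 = f(shifted(state, k1, dt / 2))
    k3 = f(shifted(state, k2, dt / 2))
    k4 = f(shifted(state, k3, dt))
    return [s + dt / 6 * (a + 2 * b + 2 * c + d)
            for s, a, b, c, d in zip(state, k1, k2, k3, k4)]

# Hypothetical setup: three nodes laid out horizontally at rest length.
n = 3
flat = [0.0, 0.0, 0.1, 0.0, 0.2, 0.0] + [0.0] * 6

# Without gravity the relaxed chain should not move.
still = rk4_step(flat, 0.005, lambda s: derivative(s, n, g=0.0))

# With gravity the free end starts to fall while the pinned node stays put.
state = flat
for _ in range(20):
    state = rk4_step(state, 0.005, lambda s: derivative(s, n))
```

A cloth would use a 2D grid of nodes with structural, shear and bending springs, but the state-space layout (positions followed by velocities) stays the same.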

University of Waterloo's Institutional Repository

Bridging the gap between reconstruction and synthesis

Author: Pumarola Peris Albert
Publication date: 13/10/2021

Embargo applied from the defense date until 15 January 2022.

3D reconstruction and image synthesis are two of the main pillars of computer vision. Early works focused on simple tasks such as multi-view reconstruction and texture synthesis. With the rise of deep learning, the field has progressed rapidly, making it possible to achieve more complex and higher-level tasks. For example, 3D reconstruction results that once required traditional multi-view approaches are now obtained with single-view methods. Similarly, early pattern-based texture synthesis works have led to techniques that can generate novel high-resolution images. In this thesis we have developed a hierarchy of tools that cover this whole range of problems, lying at the intersection of computer vision, graphics and machine learning. We tackle the problem of 3D reconstruction and synthesis in the wild. Importantly, we advocate a paradigm in which not everything should be learned. Instead of applying deep learning naively, we propose novel representations, layers and architectures that directly embed prior 3D geometric knowledge for the task of 3D reconstruction and synthesis. We apply these techniques to problems including scene/person reconstruction and photo-realistic rendering. We first address methods to reconstruct a scene and the clothed people in it while estimating the camera position. Then, we tackle image and video synthesis for clothed people in the wild. Finally, we bridge the gap between reconstruction and synthesis under the umbrella of a single novel formulation. Extensive experiments conducted throughout this thesis show that the proposed techniques improve the performance of deep learning models in terms of the quality of the reconstructed 3D shapes and synthesised images, while reducing the amount of supervision and training data required to train them.
In summary, we provide a variety of low-, mid- and high-level algorithms that can be used to incorporate prior knowledge into different stages of the deep learning pipeline and improve performance in tasks of 3D reconstruction and image synthesis.

3D reconstruction and image synthesis are two of the fundamental pillars of computer vision. Earlier studies focus on simple tasks such as reconstruction from multi-camera information and texture synthesis. With the emergence of deep learning, the field has progressed rapidly, making it possible to accomplish far more complex tasks. For example, 3D reconstructions traditionally required multi-camera methods, whereas they can now be obtained from a single image. Likewise, early pattern-based texture synthesis work has given rise to techniques that generate complete new high-resolution images. In this thesis we have developed a set of tools that cover this whole range of problems, which lie at the intersection of computer vision, graphics and machine learning. We address the problem of 3D reconstruction and synthesis in the real world. Importantly, we argue for a paradigm in which not everything has to be learned. Instead of applying deep learning naively, we propose novel representations and architectures that directly incorporate existing geometric knowledge to achieve 3D reconstruction and image synthesis. We apply these techniques to problems such as scene/person reconstruction and photorealistic image rendering. We first address methods for reconstructing a scene, the clothed people in it and the camera position. Next, we address image and video synthesis of clothed people in everyday situations. Finally, through a single new formulation, we connect reconstruction with synthesis.
The experiments carried out throughout this thesis show that the proposed techniques improve the performance of deep learning models in terms of the quality of the reconstructions and the synthesised images, while reducing the amount of data needed to train them. In summary, we provide a variety of low-, mid- and high-level algorithms that can be used to incorporate prior knowledge into the different stages of deep learning and improve performance in 3D reconstruction and image synthesis tasks.

Postprint (published version)

UPCommons. Portal del coneixement obert de la UPC

Tesis Doctorals en Xarxa