
    Practical intrinsic image decomposition

    Prior knowledge of the lights and materials that make up a scene is the first step towards its full capture and reconstruction. Obtaining this information from a single photograph, however, is no easy task. When we capture an image of the real world, all the colour, geometry and illumination information is integrated at the camera sensor, yielding a set of RGB pixels. These values lack the geometric information that would allow us to perform tasks such as relighting or material editing. The goal of this Master's thesis has been to study and solve this problem, commonly known as intrinsic image decomposition: recovering, from a single image, the component corresponding to illumination and the component corresponding to reflectance (texture, colour). Most current methods for this problem require excessive user interaction, so an inexperienced user or a lack of information can lead to poor decompositions. This work pursues an efficient solution with robust, high-quality results, starting from a single image of the scene to be decomposed. In particular, two different solutions have been studied. The first, called Intrinsic Images by Clustering, was published in the journal Computer Graphics Forum, ranked 35/83 (Q2) in the 2011 JCR for Computer Science, Software Engineering, with a five-year impact factor of 1.634. The method requires only a single image and is based on detecting regions of the image that share the same reflectance; from this information, a system of linear equations is built that describes the connections and relations between these regions. This algorithm constitutes the current state of the art in intrinsic image decomposition from a single input image. The second solution was developed in collaboration with Adobe Systems Inc. under the supervision of Dr. Sunil Hadap. It builds on the observation that the reflectance gradients of the image follow an invariant direction relative to the light source: by estimating this invariant direction from the colour information of the image, we can disambiguate changes due to reflectance from changes due to shading. The results of a first study of the algorithm in a controlled environment show that it has great potential, and combining it with complementary techniques that contribute new information about the scene opens an interesting avenue for future research. Decomposing an image into its intrinsic components remains an open problem with many potential applications; this research contributes one more step towards a globally optimal solution. We further conclude that future research should focus on algorithms requiring as little interaction as possible, since the complexity of the problem makes it mathematically impossible to obtain a unique, interaction-free solution for every scenario.
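    The clustering-based formulation lends itself to a compact illustration. The following Python sketch is not the published algorithm; the simple chromaticity clustering and all names are assumptions for illustration only. It clusters pixels by chromaticity, assigns each cluster one shared log-reflectance, and takes shading as the residual of the log-image; the published method additionally relates the clusters through a linear system, which is omitted here for brevity.

        # Toy clustering-based intrinsic decomposition: assumes pixels with the
        # same chromaticity share one reflectance (a strong simplification).
        import numpy as np
        from sklearn.cluster import KMeans

        def intrinsic_by_clustering(img, n_clusters=8, eps=1e-6):
            """img: HxWx3 float RGB in (0, 1]. Returns (reflectance, shading)."""
            h, w, _ = img.shape
            rgb = img.reshape(-1, 3) + eps
            chroma = rgb / rgb.sum(axis=1, keepdims=True)  # shading-insensitive cue
            labels = KMeans(n_clusters, n_init=4).fit_predict(chroma)
            log_i = np.log(rgb)
            log_r = np.zeros_like(log_i)
            for k in range(n_clusters):
                m = labels == k
                # Shared log-reflectance per cluster: the least-squares choice is
                # the cluster's mean log-intensity (up to a global scale ambiguity).
                log_r[m] = log_i[m].mean(axis=0)
            log_s = log_i - log_r                          # shading is the residual
            return np.exp(log_r).reshape(h, w, 3), np.exp(log_s).reshape(h, w, 3)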

    Learning geometric and lighting priors from natural images

    Understanding images is crucial for a plethora of tasks, from compositing and image relighting to 3D object reconstruction. These tasks allow visual artists to create masterpieces, or help operators make decisions safely based on visual stimuli. For many of these tasks, the physical and geometric models that the scientific community has developed give rise to ill-posed problems with several solutions, only one of which is generally reasonable. To resolve these ambiguities, reasoning about the visual and semantic context of a scene is usually delegated to an artist or an expert, who draws on experience to carry out the work; obtaining plausible, appealing results generally requires reasoning about the scene globally. Could this experience be modelled from visual data, partly or fully automating such tasks? That is the topic of this thesis: modeling priors using deep machine learning to solve typically ill-posed problems. More specifically, we cover three research axes: 1) surface reconstruction using photometric cues, 2) outdoor illumination estimation from a single image, and 3) camera calibration estimation from a single image with generic content. These three topics are addressed from a data-driven perspective. Each axis includes in-depth performance analyses and, despite the reputation for opacity of deep machine learning algorithms, we offer studies of the visual cues captured by our methods.
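    As a concrete point of reference for the first axis, the sketch below implements classical Lambertian photometric stereo, the textbook baseline for surface reconstruction from photometric cues; the learned priors that are the actual subject of the thesis are not modelled here, and all names are illustrative.

        # Classical photometric stereo: recover per-pixel normals and albedo from
        # K images of a static scene under K known directional lights.
        import numpy as np

        def photometric_stereo(images, lights):
            """images: list of K HxW grayscale arrays; lights: Kx3 unit vectors."""
            I = np.stack([im.reshape(-1) for im in images], axis=1)  # P x K
            L = np.asarray(lights)                                   # K x 3
            # Lambertian model I_p = L @ (albedo_p * n_p): least squares per pixel.
            g, *_ = np.linalg.lstsq(L, I.T, rcond=None)              # 3 x P
            g = g.T
            albedo = np.linalg.norm(g, axis=1)
            normals = g / np.maximum(albedo[:, None], 1e-8)
            h, w = images[0].shape
            return normals.reshape(h, w, 3), albedo.reshape(h, w)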

    Artistic Path Space Editing of Physically Based Light Transport

    The generation of realistic images is an important goal of computer graphics, with applications in the feature-film industry, architecture and medicine, among others. Physically based rendering, which has recently found broad acceptance across applications, relies on the numerical simulation of light transport along propagation paths prescribed by geometric optics, a model that suffices to achieve photorealism for common scenes. On the whole, the computer-assisted creation of images and animations with well-designed, theoretically well-founded shading has thus become much simpler. In practice, however, attention to details such as the structure of the output device also matters, and subproblems such as efficient physically based rendering in participating media are still far from being considered solved. Furthermore, rendering must be seen as part of a wider context: the effective communication of ideas and information. Whether it is the form and function of a building, the medical visualisation of a CT scan, or the mood of a film sequence, messages in the form of digital images are omnipresent today. Unfortunately, the spread of the simulation-oriented methodology of physically based rendering has generally led to a loss of the intuitive, fine-grained and local artistic control over the final image content that earlier, less strict paradigms provided. The contributions of this dissertation cover different aspects of rendering: first, fundamental subpixel rendering as well as efficient rendering techniques for participating media. At the core of the work, however, are approaches for an effective visual understanding of light propagation that enable local artistic control while producing consistent and plausible results at the global level. The key idea is to visualise and edit light directly in the "path space" that encompasses all possible light paths. This stands in contrast to state-of-the-art techniques, which either operate in image space or are tailored to specific, isolated lighting effects such as perfect mirror reflections, shadows or caustics. Evaluation of the presented techniques has shown that they can solve real problems of image generation arising in film production.
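    The path-space idea can be made concrete with a toy example: if every transport path carries a Heckbert-style token string (L for light, S for specular, D for diffuse, E for eye), then an artist edit becomes a multiplier applied to exactly those paths whose class matches a pattern, leaving all other transport untouched. The sketch below is a schematic stand-in, not the dissertation's system; all names are hypothetical.

        # Scale the contribution of path classes selected by regular expressions
        # over Heckbert path tokens, e.g. dim only caustic-like paths.
        import re

        def edited_contribution(paths, edits):
            """paths: iterable of (token_string, rgb); edits: list of (regex, scale)."""
            total = [0.0, 0.0, 0.0]
            for tokens, rgb in paths:
                scale = 1.0
                for pattern, s in edits:
                    if re.fullmatch(pattern, tokens):
                        scale *= s
                for c in range(3):
                    total[c] += scale * rgb[c]
            return total

        # Halve every path with at least one specular bounce after the light.
        paths = [("LDE", (0.4, 0.4, 0.4)), ("LSDE", (0.2, 0.1, 0.0))]
        print(edited_contribution(paths, edits=[(r"LS+D?E", 0.5)]))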

    Multilinear methods for disentangling variations with applications to facial analysis

    Several factors contribute to the appearance of an object in a visual scene, including pose, illumination, and deformation, among others. Each factor accounts for a source of variability in the data. It is assumed that the multiplicative interactions of these factors emulate the entangled variability, giving rise to the rich structure of visual object appearance. Disentangling such unobserved factors from visual data is a challenging task, especially when the data have been captured in uncontrolled recording conditions (also referred to as "in-the-wild") and label information is not available. The work presented in this thesis focuses on disentangling the variations contained in visual data, in particular applied to 2D and 3D faces. The motivation behind this work lies in recent developments in the field, such as (i) the creation of large visual databases for face analysis, (ii) the need to extract information without the use of labels and (iii) the need to deploy systems under demanding, real-world conditions. In the first part of this thesis, we present a method to synthesise plausible 3D expressions that preserve the identity of a target subject. This method is supervised, as the model learns from labels, in this case 3D facial meshes of people performing a defined set of facial expressions. The ability to synthesise an entire facial rig from a single neutral expression has a large range of applications in both computer graphics and computer vision, ranging from the efficient and cost-effective creation of CG characters to scalable data generation for machine learning purposes. Unlike previous methods based on multilinear models, the proposed approach is capable of extrapolating well outside the sample pool, which allows it to accurately reproduce the identity of the target subject and create artefact-free expression shapes while requiring only a small input dataset. We introduce global-local multilinear models that leverage the strengths of expression-specific and identity-specific local models combined with coarse motion estimations from a global model. The expression-specific and identity-specific local models are built from different slices of the patch-wise local multilinear model. Experimental results show that we achieve high-quality, identity-preserving facial expression synthesis results that outperform existing methods both quantitatively and qualitatively. In the second part of this thesis, we investigate how the modes of variation of visual data can be extracted. Our assumption is that visual data has an underlying structure consisting of factors of variation and their interactions. Finding this structure and the factors is important, as it would not only help us to better understand visual data but, once obtained, the factors can be edited for use in various applications; Shape from Shading and expression transfer are just two of the potential applications. To extract the factors of variation, several supervised methods have been proposed, but they require both labels regarding the modes of variation and the same number of samples under all modes of variation. Their applicability is therefore limited to well-organised data, usually captured in well-controlled conditions. We propose a novel general multilinear matrix decomposition method that discovers the multilinear structure of possibly incomplete sets of visual data in an unsupervised setting. We demonstrate the applicability of the proposed method in several computer vision tasks, including Shape from Shading (SfS) (in the wild and with occlusion removal), expression transfer, and estimation of surface normals from images captured in the wild. Finally, leveraging the proposed unsupervised multilinear method as well as recent advances in deep learning, we propose a weakly supervised deep learning method for disentangling multiple latent factors of variation in face images captured in-the-wild. To this end, we propose a deep latent variable model, where we model the multiplicative interactions of multiple latent factors of variation explicitly as a multilinear structure. We demonstrate that the proposed approach indeed learns disentangled representations of facial expressions and pose, which can be used in various applications, including face editing, as well as 3D face reconstruction and classification of facial expression, identity and pose.
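    The multilinear structure referred to throughout can be sketched compactly: in a Tucker-style face model, a core tensor is contracted with one coefficient vector per factor (here identity and expression) to synthesise a mesh. The shapes and names below are illustrative toys, not the thesis's actual model.

        # Mode-n products of a core tensor with identity/expression coefficients.
        import numpy as np

        def mode_n_product(T, M, n):
            """Contract mode n of tensor T with matrix M (rows index the new mode)."""
            T = np.moveaxis(T, n, 0)
            out = np.tensordot(M, T, axes=([1], [0]))
            return np.moveaxis(out, 0, n)

        core = np.random.rand(3 * 100, 10, 5)  # toy: 100 vertices, 10 ids, 5 expressions
        w_id = np.random.rand(1, 10)           # identity coefficients
        w_expr = np.random.rand(1, 5)          # expression coefficients
        shape = mode_n_product(mode_n_product(core, w_id, 1), w_expr, 2)
        vertices = shape.reshape(100, 3)       # synthesised 3D face mesh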

    Appearance Preserving Rendering of Out-of-Core Polygon and NURBS Models

    In Computer-Aided Design (CAD), trimmed NURBS surfaces are widely used due to their flexibility. For rendering and simulation, however, piecewise linear representations of these objects are required. A relatively new field in CAD is the analysis of long-term strain tests; after such a test the object is scanned with a 3D laser scanner for further processing on a PC. In all these areas of CAD, both the number of primitives and their complexity have grown constantly in recent years. This growth far exceeds the increase in processor speed and memory size, creating the need for fast out-of-core algorithms. This thesis describes a processing pipeline leading from the input data, in the form of triangular or trimmed NURBS models, to the interactive rendering of these models at high visual quality. After discussing the motivation for this work and introducing basic concepts of complex polygon and NURBS models, the second part of this thesis starts with a review of existing simplification and tessellation algorithms. Additionally, an improved stitching algorithm is presented that generates a consistent model after tessellation of a trimmed NURBS model. Since surfaces need to be modified interactively during the design phase, a novel trimmed NURBS rendering algorithm is presented. By evaluating and trimming the surface on the GPU, this algorithm removes the bottleneck of generating and transmitting a new tessellation to the graphics card after each modification of a surface. To achieve high visual quality, the appearance of a surface can be preserved using texture mapping; therefore, a texture mapping algorithm for trimmed NURBS surfaces is presented. To reduce the memory requirements of the textures, the algorithm is modified to generate compressed normal maps that preserve the shading of the original surface. Since texturing is only possible when a parametric mapping of the surface, which requires additional memory, is available, a new simplification and tessellation error measure is introduced that preserves the appearance of the original surface by controlling the deviation of normal vectors. The preservation of normals, and possibly other surface attributes, enables interactive visualization for quality-control applications (e.g. isophotes and reflection lines). In the last part, out-of-core techniques for processing and rendering gigabyte-sized polygonal and trimmed NURBS models are presented. The modifications necessary to support streaming of simplified geometry from a central server are then discussed, and finally an LOD selection algorithm supporting the interactive rendering of hard and soft shadows is described.
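    For concreteness, the sketch below shows the basic operation that the GPU rendering algorithm above avoids re-tessellating for: evaluating a point on a NURBS surface via the Cox-de Boor recursion. Degrees, knots and the control net are illustrative; the thesis's actual trimming and tessellation pipeline is considerably more involved.

        # Naive NURBS surface evaluation, O(n*m) basis evaluations per point.
        import numpy as np

        def basis(i, p, u, U):
            """Cox-de Boor recursion: B-spline basis N_{i,p}(u) on knot vector U."""
            if p == 0:
                return 1.0 if U[i] <= u < U[i + 1] else 0.0
            left = right = 0.0
            if U[i + p] != U[i]:
                left = (u - U[i]) / (U[i + p] - U[i]) * basis(i, p - 1, u, U)
            if U[i + p + 1] != U[i + 1]:
                right = (U[i + p + 1] - u) / (U[i + p + 1] - U[i + 1]) \
                        * basis(i + 1, p - 1, u, U)
            return left + right

        def nurbs_point(u, v, P, W, U, V, p=2, q=2):
            """P: (n, m, 3) control points, W: (n, m) weights, U/V: knot vectors.
            u, v must lie inside the valid (half-open) knot spans."""
            n, m, _ = P.shape
            num, den = np.zeros(3), 0.0
            for i in range(n):
                for j in range(m):
                    b = basis(i, p, u, U) * basis(j, q, v, V) * W[i, j]
                    num += b * P[i, j]
                    den += b
            return num / den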

    Freeform User Interfaces for Graphical Computing

    Report number: 甲15222; Date of degree conferral: 2000-03-29; Degree category: Doctorate by coursework; Degree: Doctor of Engineering; Diploma number: 博工第4717号; Graduate school and department: Graduate School of Engineering, Department of Information Engineering

    Computer Graphics Learning Materials

    This thesis provides an overview of the learning material and the custom learning environment created for the Computer Graphics (MTAT.03.015) course at the University of Tartu. It describes the modular layout in which the course was organized, combining top-down and bottom-up approaches. The created material also includes interactive examples that satisfy level 4 of the engagement taxonomy. The specification and implementation details of the custom learning environment, CGLearn, are given. The thesis concludes with an analysis of a feedback questionnaire answered by the students who participated in the course and used the material. Due to server problems, the appendix is available at: http://comserv.cs.ut.ee/forms/ati_report/files/ComputerGraphicsLearningMaterialsAppendix.zip