    Transferrable learning from synthetic data: novel texture synthesis using Domain Randomization for visual scene understanding

    Modern supervised deep learning-based approaches typically rely on vast quantities of annotated data for training computer vision and robotics tasks. A key challenge is acquiring data that encompasses the diversity encountered in the real world. The use of synthetic or computer-generated data for solving these tasks has recently garnered attention for several reasons. The first is the efficiency of producing large amounts of annotated data in a fraction of the time required in reality, addressing the time expense of manual annotation. The second is that it avoids the inaccuracies and mistakes that arise from the laborious task of manual annotation. The third is that it addresses the need for vast amounts of data typically required by data-driven state-of-the-art computer vision and robotics systems. Due to domain shift, models trained on synthetic data typically underperform those trained on real-world data when deployed in the real world. Domain Randomization is a data generation approach for the synthesis of artificial data. The Domain Randomization process can generate diverse synthetic images by randomizing rendering parameters in a simulator, such as the objects, their visual appearance, the lighting, and where they appear in the picture (see the first sketch below). This synthetic data can be used to train systems capable of performing well in reality. However, it is unclear how best to select Domain Randomization parameters such as the types of textures, object poses, or types of backgrounds. Furthermore, it is unclear how Domain Randomization generalizes across various vision tasks or whether there are potential improvements to the technique. This thesis explores novel Domain Randomization techniques to solve object localization, detection, and semantic segmentation in cluttered and occluded real-world scenarios. In particular, the four main contributions of this dissertation are: (i) The first contribution proposes a novel method for quantifying the differences between Domain Randomized and realistic data distributions using a small number of samples. The approach ranks all commonly applied Domain Randomization texture techniques in the existing literature and finds that the ranking is reflected in the task-based performance of an object localization task. (ii) The second contribution introduces the SRDR dataset, a large domain randomized dataset containing 291K frames of household objects widely used in robotics and vision benchmarking [23]. SRDR builds on the YCB-M [67] dataset by generating synthetic versions of the images in YCB-M using a variety of domain randomized texture types in 5 unique environments with varying scene complexity. The SRDR dataset is highly beneficial for cross-domain training, evaluation, and comparison investigations. (iii) The third contribution presents a study evaluating Domain Randomization's generalizability and robustness in sim-to-real transfer in complex scenes for object detection and semantic segmentation. We find that the performance ranking is largely similar across the two tasks when models trained on Domain Randomized synthetic data are evaluated on real-world data, indicating that Domain Randomization performs similarly across multiple tasks. (iv) Finally, we present a fast, easy-to-execute, novel approach for conditionally generating domain randomized textures. The textures are generated by randomly sampling patches from real-world images to apply to objects of interest (see the second sketch below).
    This approach outperforms the most commonly used Domain Randomization texture method, improving object detection from 13.157 AP to 21.287 AP and semantic segmentation from 8.950 AP to 19.481 AP. The technique eliminates the need to manually define texture distributions from which Domain Randomized textures are sampled. We propose a further improvement to address low texture diversity when only a small number of real-world images is available: a conditional GAN-based texture generator trained on a few real-world image patches increases texture diversity and outperforms the most commonly applied Domain Randomization texture method, improving object detection from 13.157 AP to 20.287 AP and semantic segmentation from 8.950 AP to 17.636 AP.
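    The two texture ideas in this abstract lend themselves to short sketches. First, a minimal illustration of Domain Randomization as described above: sampling rendering parameters (objects, textures, lighting, camera pose) at random for each synthetic frame. The parameter names, value ranges, and the render_scene interface are illustrative assumptions, not the thesis's actual configuration.

```python
# A minimal sketch of Domain Randomization parameter sampling.
# All parameter names and ranges below are hypothetical placeholders.
import random

TEXTURE_TYPES = ["flat_color", "perlin_noise", "checkerboard", "image_patch"]

def sample_randomized_scene():
    """Draw one random configuration of the rendering parameters."""
    num_objects = random.randint(1, 10)
    return {
        "num_objects": num_objects,
        "object_textures": [random.choice(TEXTURE_TYPES)
                            for _ in range(num_objects)],
        "light_intensity": random.uniform(0.2, 2.0),   # arbitrary range
        "light_position": [random.uniform(-1, 1) for _ in range(3)],
        "camera_pose": [random.uniform(-0.5, 0.5) for _ in range(6)],
        "background": random.choice(["noise", "texture", "photo"]),
    }

# Each sampled configuration would be handed to a renderer to produce one
# annotated synthetic frame, e.g.
# frame = render_scene(**sample_randomized_scene())
```

    Second, a minimal sketch of the patch-based texture generation in contribution (iv): crop random patches from real photographs and apply them as object textures. The patch size and helper names are assumptions; the thesis's conditional generation is richer than this.

```python
# A minimal sketch of sampling real-image patches as object textures.
import random
import numpy as np

def sample_texture_patch(images, patch_size=64):
    """Return a random patch_size x patch_size crop of a random real image."""
    img = random.choice(images)              # H x W x 3 uint8 array
    h, w = img.shape[:2]
    y = random.randint(0, h - patch_size)    # top-left corner of the crop
    x = random.randint(0, w - patch_size)
    return img[y:y + patch_size, x:x + patch_size]

# Demo with a synthetic stand-in for a real photograph:
real_images = [np.random.randint(0, 256, (480, 640, 3), dtype=np.uint8)]
texture = sample_texture_patch(real_images)  # 64 x 64 x 3 texture patch
```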

    Highly automated method for facial expression synthesis

    The synthesis of realistic facial expressions has long been a challenging area for computer graphics scientists. Over the last three decades, several different construction methods have been formulated in order to obtain natural graphic results. Despite these advancements, current techniques still require costly resources, heavy user intervention, and specific training, and outcomes are still not completely realistic. This thesis therefore aims to achieve an automated synthesis that produces realistic facial expressions at a low cost. It proposes a highly automated approach to realistic facial expression synthesis that allows for enhanced performance in speed (at most 3 minutes of processing time) and quality with a minimum of user intervention. It also demonstrates a highly technical and automated method of facial feature detection, allowing users to obtain their desired facial expression synthesis with minimal physical input. Moreover, it describes a novel approach to normalizing the illumination settings between source and target images, thereby allowing the algorithm to work accurately even under different lighting conditions. Finally, we present the results obtained from the proposed techniques, together with our conclusions, at the end of the thesis.
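    The illumination normalization step lends itself to a short sketch. Below is a minimal Python illustration that matches per-channel mean and standard deviation between a source and a target image; this statistics matching is a common stand-in for illumination normalization, and the thesis's exact formulation may differ.

```python
# A minimal sketch of illumination normalization between two images by
# matching per-channel mean/std statistics (a simple stand-in for a full
# illumination model, not the thesis's exact method).
import numpy as np

def normalize_illumination(source, target):
    """Shift/scale source channels to match the target's statistics."""
    src = source.astype(np.float32)
    tgt = target.astype(np.float32)
    out = (src - src.mean(axis=(0, 1))) / (src.std(axis=(0, 1)) + 1e-6)
    out = out * tgt.std(axis=(0, 1)) + tgt.mean(axis=(0, 1))
    return np.clip(out, 0, 255).astype(np.uint8)
```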

    Depicting shape, materials and lighting: observation, formulation and implementation of artistic principles

    The appearance of a scene results from complex interactions between the geometry, materials and lights that compose that scene. While Computer Graphics algorithms are now capable of simulating these interactions, this comes at the cost of tedious 3D modeling of a virtual scene, which only well-trained artists can do. In contrast, photographs allow the instantaneous capture of a scene, but shape, materials and lighting are difficult to manipulate directly in the image. Drawings can also suggest real or imaginary scenes with a few lines, but creating convincing illustrations requires significant artistic skills. The goal of my research is to facilitate the creation and manipulation of shape, materials and lighting in drawings and photographs, for laymen and professional artists alike. This document first presents my work on computer-assisted drawing, where I proposed algorithms to automate the depiction of materials in line drawings as well as to estimate a 3D model from design sketches. I also worked on user interfaces to assist beginners in learning traditional drawing techniques. Through the development of these projects I have formalized a general methodology to observe how artists work, deduce artistic principles from these observations, and implement these principles as algorithms. In the second part of this document I present my work on relighting multiple photographs of a scene, for which we first need to estimate the materials and lighting that compose that scene. The main novelty of our approach is to combine image analysis and lighting simulation in order to reason about the scene despite the lack of an accurate 3D model.

    Affect-preserving visual privacy protection

    The prevalence of wireless networks and the convenience of mobile cameras enable many new video applications beyond security and entertainment. From behavioral diagnosis to wellness monitoring, cameras are increasingly used for observation in various educational and medical settings. Videos collected for such applications are considered protected health information under privacy laws in many countries. Visual privacy protection techniques, such as blurring or object removal, can be used to mitigate privacy concerns, but they also obliterate important visual cues of affect and social behavior that are crucial for the target applications. In this dissertation, we propose to balance privacy protection and the utility of the data by preserving privacy-insensitive information, such as pose and expression, which is useful in many applications involving visual understanding. The Intellectual Merits of the dissertation include a novel framework for visual privacy protection that manipulates the facial image and body shape of individuals and thereby: (1) conceals the identity of individuals; (2) preserves the utility of the data, such as expression and pose information; and (3) balances the utility of the data against the capacity of the privacy protection. The Broader Impacts of the dissertation focus on the significance of privacy protection for visual data and the inadequacy of current privacy-enhancing technologies in preserving the affect and behavioral attributes of visual content, which are highly useful for behavior observation in educational and medical settings. The work in this dissertation represents one of the first attempts at achieving both goals simultaneously.
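    For context, here is a minimal sketch of the kind of baseline the dissertation argues is inadequate: Gaussian blurring of detected faces, which conceals identity but also destroys the expression cues the target applications rely on. It uses OpenCV's stock Haar cascade; the kernel size and detector settings are illustrative choices, not values from the dissertation.

```python
# A minimal sketch of blur-based face anonymization (the baseline that
# obliterates affect cues, motivating affect-preserving alternatives).
import cv2

def blur_faces(image):
    """Blur every detected face region in a BGR image in place."""
    cascade = cv2.CascadeClassifier(
        cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    for (x, y, w, h) in cascade.detectMultiScale(gray, 1.1, 5):
        face = image[y:y + h, x:x + w]
        image[y:y + h, x:x + w] = cv2.GaussianBlur(face, (51, 51), 0)
    return image
```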