
    Latent-Space Laplacian Pyramids for Adversarial Representation Learning with 3D Point Clouds

    Constructing high-quality generative models for 3D shapes is a fundamental task in computer vision with diverse applications in geometry processing, engineering, and design. Despite the recent progress in deep generative modelling, the synthesis of finely detailed 3D surfaces, such as high-resolution point clouds, from scratch has not been achieved by existing approaches. In this work, we propose to employ the latent-space Laplacian pyramid representation within a hierarchical generative model for 3D point clouds. We combine the recently proposed latent-space GAN and Laplacian GAN architectures to form a multi-scale model capable of generating 3D point clouds at increasing levels of detail. Our evaluation demonstrates that our model outperforms existing generative models for 3D point clouds.
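    To make the pyramid idea concrete, here is a minimal PyTorch sketch (not the authors' implementation; the dimensions, module names and residual refinement scheme are assumptions) in which each level refines the previous latent with a learned residual and decodes it into a denser point cloud:

        import torch
        import torch.nn as nn

        LATENT_DIM = 128                      # assumed latent size
        POINTS_PER_LEVEL = [256, 1024, 4096]  # assumed per-level resolutions

        class ResidualRefiner(nn.Module):
            """One pyramid level: adds a learned residual to the coarser latent."""
            def __init__(self, dim=LATENT_DIM):
                super().__init__()
                self.net = nn.Sequential(nn.Linear(dim, dim), nn.ReLU(),
                                         nn.Linear(dim, dim))

            def forward(self, z):
                return z + self.net(z)        # coarse latent plus detail residual

        class PointDecoder(nn.Module):
            """Decodes a latent vector into an (N, 3) point cloud."""
            def __init__(self, n_points, dim=LATENT_DIM):
                super().__init__()
                self.n_points = n_points
                self.net = nn.Sequential(nn.Linear(dim, 512), nn.ReLU(),
                                         nn.Linear(512, n_points * 3))

            def forward(self, z):
                return self.net(z).view(-1, self.n_points, 3)

        class LatentPyramidGenerator(nn.Module):
            def __init__(self):
                super().__init__()
                self.base = nn.Linear(LATENT_DIM, LATENT_DIM)  # coarse latent from noise
                self.refiners = nn.ModuleList([ResidualRefiner()
                                               for _ in POINTS_PER_LEVEL[1:]])
                self.decoders = nn.ModuleList([PointDecoder(n)
                                               for n in POINTS_PER_LEVEL])

            def forward(self, noise):
                z = self.base(noise)
                clouds = [self.decoders[0](z)]
                for refiner, decoder in zip(self.refiners, self.decoders[1:]):
                    z = refiner(z)            # add detail in latent space
                    clouds.append(decoder(z))  # denser cloud at each level
                return clouds                 # coarse-to-fine point clouds

        coarse, mid, fine = LatentPyramidGenerator()(torch.randn(4, LATENT_DIM))
        print(coarse.shape, mid.shape, fine.shape)  # (4,256,3) (4,1024,3) (4,4096,3)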

    Shape Completion using 3D-Encoder-Predictor CNNs and Shape Synthesis

    We introduce a data-driven approach to complete partial 3D shapes through a combination of volumetric deep neural networks and 3D shape synthesis. From a partially-scanned input shape, our method first infers a low-resolution -- but complete -- output. To this end, we introduce a 3D-Encoder-Predictor Network (3D-EPN) which is composed of 3D convolutional layers. The network is trained to predict and fill in missing data, and operates on an implicit surface representation that encodes both known and unknown space. This allows us to predict global structure in unknown areas at high accuracy. We then correlate these intermediary results with 3D geometry from a shape database at test time. In a final pass, we propose a patch-based 3D shape synthesis method that imposes the 3D geometry from these retrieved shapes as constraints on the coarsely-completed mesh. This synthesis process enables us to reconstruct fine-scale detail and generate high-resolution output while respecting the global mesh structure obtained by the 3D-EPN. Although our 3D-EPN outperforms state-of-the-art completion methods, the main contribution of our work lies in the combination of a data-driven shape predictor and analytic 3D shape synthesis. In our results, we show extensive evaluations on a newly-introduced shape completion benchmark for both real-world and synthetic data.
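    A minimal sketch of the encoder-predictor idea, assuming a 32^3 grid with a distance channel plus a known/unknown mask channel (the real 3D-EPN is deeper and uses a different input encoding):

        import torch
        import torch.nn as nn

        class EncoderPredictor3D(nn.Module):
            """Toy volumetric encoder-predictor on a 32^3 grid (assumed size)."""
            def __init__(self):
                super().__init__()
                # Encoder: compress the grid into a compact code.
                self.enc = nn.Sequential(
                    nn.Conv3d(2, 16, 4, stride=2, padding=1), nn.ReLU(),   # 32 -> 16
                    nn.Conv3d(16, 32, 4, stride=2, padding=1), nn.ReLU(),  # 16 -> 8
                    nn.Conv3d(32, 64, 4, stride=2, padding=1), nn.ReLU())  # 8 -> 4
                # Predictor: upsample back and emit a completed distance field.
                self.dec = nn.Sequential(
                    nn.ConvTranspose3d(64, 32, 4, stride=2, padding=1), nn.ReLU(),
                    nn.ConvTranspose3d(32, 16, 4, stride=2, padding=1), nn.ReLU(),
                    nn.ConvTranspose3d(16, 1, 4, stride=2, padding=1))

            def forward(self, x):
                return self.dec(self.enc(x))

        # Two assumed input channels: truncated distance values plus a mask
        # marking which voxels were actually observed (known vs. unknown space).
        tsdf = torch.randn(1, 1, 32, 32, 32)
        known = torch.ones(1, 1, 32, 32, 32)
        completed = EncoderPredictor3D()(torch.cat([tsdf, known], dim=1))
        print(completed.shape)  # torch.Size([1, 1, 32, 32, 32])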

    3D GANs and Latent Space: A comprehensive survey

    Generative Adversarial Networks (GANs) have emerged as a significant player in generative modeling by mapping lower-dimensional random noise to higher-dimensional spaces. These networks have been used to generate high-resolution images and 3D objects. The efficient modeling of 3D objects and human faces is crucial in the development process of 3D graphical environments such as games or simulations. 3D GANs are a new type of generative model used for 3D reconstruction, point cloud reconstruction, and 3D semantic scene completion. The choice of distribution for the noise is critical, as it defines the latent space. Understanding a GAN's latent space is essential for fine-tuning the generated samples, as demonstrated by the morphing of semantically meaningful parts of images. In this work, we explore the latent space and 3D GANs, examine several GAN variants and training methods to gain insights into improving 3D GAN training, and suggest potential directions for further research.
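    As an illustration of the latent-space morphing mentioned above (a generic technique, not tied to any specific model in the survey), spherical interpolation between two noise codes yields a path that a trained generator turns into a smooth morph:

        import torch

        def slerp(z0, z1, t):
            """Spherical interpolation between two latent codes; often preferred
            over linear interpolation for Gaussian latent spaces."""
            omega = torch.acos(torch.clamp(
                (z0 / z0.norm()) @ (z1 / z1.norm()), -1.0, 1.0))
            so = torch.sin(omega)
            return ((torch.sin((1 - t) * omega) / so) * z0
                    + (torch.sin(t * omega) / so) * z1)

        z_a, z_b = torch.randn(128), torch.randn(128)
        # Decoding each code with a trained generator morphs one sample into another.
        path = [slerp(z_a, z_b, t) for t in torch.linspace(0, 1, 8)]
        print(len(path), path[0].shape)  # 8 torch.Size([128])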

    Generative Networks for Point Cloud Generation in the Cultural Heritage Domain

    In the Cultural Heritage (CH) domain, the semantic segmentation of 3D point clouds with Deep Learning (DL) techniques makes it possible to recognize historical architectural elements at a suitable level of detail, and hence to expedite the modelling of historical buildings for the development of BIM models from survey data. However, it is difficult to collect a balanced dataset of labelled architectural elements for training a network: CH objects are unique, and it is challenging for a network to recognize this kind of data. In recent years, Generative Networks have proven well suited to generating new data. Starting from such premises, in this paper Generative Networks have been used for augmenting a CH dataset. In particular, the performances of three state-of-the-art Generative Networks, PointGrow, PointFlow and PointGMM, have been compared in terms of the Jensen-Shannon Divergence (JSD), the Minimum Matching Distance-Chamfer Distance (MMD-CD) and the Minimum Matching Distance-Earth Mover's Distance (MMD-EMD). The generated objects have been used for augmenting two classes of the ArCH dataset, columns and windows. A DGCNN-Mod network was then trained and tested for the semantic segmentation task, comparing its performance on the ArCH dataset with and without augmentation.
    Roberto Pierdicca, Marina Paolanti, Ramona Quattrini, Massimo Martini, Eva Savina Malinverni, Emanuele Frontoni
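    For reference, the MMD-CD metric used above can be sketched as follows (a simplified, unoptimized version; batched implementations used in practice differ):

        import torch

        def chamfer_distance(a, b):
            """Symmetric Chamfer Distance between point clouds a (N, 3) and b (M, 3)."""
            d = torch.cdist(a, b)               # (N, M) pairwise Euclidean distances
            return d.min(dim=1).values.mean() + d.min(dim=0).values.mean()

        def mmd_cd(generated, reference):
            """Minimum Matching Distance (Chamfer): for each reference cloud, the
            distance to its closest generated cloud, averaged over the reference set."""
            return sum(min(chamfer_distance(g, r).item() for g in generated)
                       for r in reference) / len(reference)

        gen_set = [torch.randn(256, 3) for _ in range(4)]  # stand-ins for generated shapes
        ref_set = [torch.randn(256, 3) for _ in range(3)]  # stand-ins for real shapes
        print(mmd_cd(gen_set, ref_set))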

    Adversarial Self-Supervised Scene Flow Estimation

    This work proposes a metric learning approach for self-supervised scene flow estimation. Scene flow estimation is the task of estimating 3D flow vectors for consecutive 3D point clouds. Such flow vectors are useful, e.g., for recognizing actions or avoiding collisions. Training a neural network for scene flow via supervised learning is impractical, as it requires manual annotations for each 3D point at each new timestamp in each scene. To that end, we seek a self-supervised approach, where a network learns a latent metric to distinguish between points translated by flow estimations and the target point cloud. Our adversarial metric learning includes a multi-scale triplet loss on sequences of two point clouds as well as a cycle consistency loss. Furthermore, we outline a benchmark for self-supervised scene flow estimation: the Scene Flow Sandbox. The benchmark consists of five datasets designed to study individual aspects of flow estimation in progressive order of complexity, from a moving object to real-world scenes. Experimental evaluation on the benchmark shows that our approach obtains state-of-the-art self-supervised scene flow results, outperforming recent neighbor-based approaches. We use our proposed benchmark to expose shortcomings and draw insights on various training setups. We find that our setup captures motion coherence and preserves local geometries. Dealing with occlusions, on the other hand, is still an open challenge. Published at 3DV 2020.
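    A rough sketch of the two self-supervised signals described above, with plain nearest-neighbor distances standing in for the paper's learned latent metric (function names and the margin value are assumptions):

        import torch
        import torch.nn.functional as F

        def triplet_flow_loss(warped, target, negative, margin=1.0):
            """Triplet loss: the warped cloud (source points plus estimated flow)
            should be closer to the target cloud than to a negative sample."""
            d_pos = torch.cdist(warped, target).min(dim=1).values.mean()
            d_neg = torch.cdist(warped, negative).min(dim=1).values.mean()
            return F.relu(d_pos - d_neg + margin)

        def cycle_consistency_loss(flow_fwd, flow_bwd):
            """Warping forward and then backward should return each point to
            its start, i.e. the two flows should cancel."""
            return (flow_fwd + flow_bwd).norm(dim=-1).mean()

        p1 = torch.randn(512, 3)
        flow = torch.randn(512, 3) * 0.1
        print(triplet_flow_loss(p1 + flow, p1 + flow, torch.randn(512, 3)))
        print(cycle_consistency_loss(flow, -flow))  # perfect inverse -> tensor(0.)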

    A Review on Deep Learning Techniques for Video Prediction

    The ability to predict, anticipate and reason about future outcomes is a key component of intelligent decision-making systems. In light of the success of deep learning in computer vision, deep-learning-based video prediction has emerged as a promising research direction. Defined as a self-supervised learning task, video prediction represents a suitable framework for representation learning, as it has demonstrated potential for extracting meaningful representations of the underlying patterns in natural videos. Motivated by the increasing interest in this task, we provide a review of the deep learning methods for prediction in video sequences. We first define the video prediction fundamentals, as well as mandatory background concepts and the most used datasets. Next, we carefully analyze existing video prediction models organized according to a proposed taxonomy, highlighting their contributions and their significance in the field. The summary of the datasets and methods is accompanied by experimental results that facilitate the assessment of the state of the art on a quantitative basis. The paper concludes by drawing general conclusions, identifying open research challenges and pointing out future research directions. This work has been funded by the Spanish Government grant PID2019-104818RB-I00 for the MoDeaAS project, supported with FEDER funds, and by two Spanish national grants for PhD studies, FPU17/00166 and ACIF/2018/197.
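    As a minimal, self-contained illustration of why video prediction is self-supervised (a toy model, far simpler than the architectures surveyed): the training target is simply the next frame, so no manual labels are needed:

        import torch
        import torch.nn as nn

        class NextFramePredictor(nn.Module):
            """Toy convolutional predictor: given k past RGB frames stacked along
            the channel axis, regress the next frame."""
            def __init__(self, k=4):
                super().__init__()
                self.net = nn.Sequential(
                    nn.Conv2d(3 * k, 32, 3, padding=1), nn.ReLU(),
                    nn.Conv2d(32, 3, 3, padding=1))

            def forward(self, frames):      # frames: (B, 3*k, H, W)
                return self.net(frames)

        past = torch.randn(2, 12, 64, 64)   # four RGB frames per sample
        future = torch.randn(2, 3, 64, 64)  # ground truth is just the next frame
        loss = nn.functional.mse_loss(NextFramePredictor()(past), future)
        loss.backward()                     # no manual labels were needed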

    RGB to 3D garment reconstruction using UV map representations

    Predicting the geometry of a 3D object from a single image or viewpoint is an intrinsically human ability that is extremely challenging for machines. For years, different computer vision approaches and techniques have been investigated in an attempt to solve this problem. One of the most active research domains has been the 3D reconstruction and modelling of human bodies. However, the greatest advances in this field have concentrated on recovering unclothed human bodies, ignoring garments. Garments are highly detailed, dynamic objects made up of particles that interact with each other and with other objects, making the reconstruction task even more difficult. Therefore, a lightweight 3D representation capable of modelling fine details is of great importance. This thesis presents a deep learning framework based on Generative Adversarial Networks (GANs) to reconstruct 3D garment models from a single RGB image. It has the peculiarity of using UV maps to represent 3D data, a lightweight representation capable of dealing with high-resolution details and wrinkles. With this model and this kind of 3D representation, we achieve state-of-the-art results on the CLOTH3D dataset, generating good-quality, realistic reconstructions regardless of garment topology, human pose, occlusions and lighting, thus demonstrating the suitability of UV maps for 3D domains and tasks.
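    A toy sketch of the UV-map representation (not the thesis' architecture; the sizes and plain encoder-decoder are assumptions): the generator outputs an image whose channels store xyz coordinates, so 3D geometry is recovered by reading texels:

        import torch
        import torch.nn as nn

        class RGBToUVGenerator(nn.Module):
            """Toy encoder-decoder mapping an RGB image to a 3-channel UV position
            map: texel (u, v) stores the (x, y, z) of the garment surface point
            with those texture coordinates."""
            def __init__(self):
                super().__init__()
                self.net = nn.Sequential(
                    nn.Conv2d(3, 32, 4, stride=2, padding=1), nn.ReLU(),   # 128 -> 64
                    nn.Conv2d(32, 64, 4, stride=2, padding=1), nn.ReLU(),  # 64 -> 32
                    nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(),
                    nn.ConvTranspose2d(32, 3, 4, stride=2, padding=1))

            def forward(self, rgb):
                return self.net(rgb)

        image = torch.randn(1, 3, 128, 128)
        uv_map = RGBToUVGenerator()(image)   # (1, 3, 128, 128): xyz per texel
        # A point cloud (or mesh vertices, via a fixed UV atlas) is recovered by
        # reading the xyz values stored at valid texels.
        points = uv_map.permute(0, 2, 3, 1).reshape(-1, 3)
        print(points.shape)  # torch.Size([16384, 3])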