
    Multi-View Data Generation Without View Supervision

    The development of high-dimensional generative models has recently attracted a surge of interest with the introduction of variational auto-encoders and generative adversarial networks. Different variants have been proposed in which the underlying latent space is structured, for example, based on attributes describing the data to generate. We focus on the particular problem of generating samples corresponding to a number of objects under various views. We assume that the distribution of the data is driven by two independent latent factors: the content, which represents the intrinsic features of an object, and the view, which stands for the settings of a particular observation of that object. We therefore propose a generative model, and a conditional variant, built on such a disentangled latent space. This approach allows us to generate realistic samples corresponding to various objects under a high variety of views. Unlike many multi-view approaches, our model requires no supervision on the views, only on the content. Compared to other conditional generation approaches, which are mostly based on binary or categorical attributes, we make no such assumption about the factors of variation, so our model can be used on problems with a huge, potentially infinite, number of categories. We evaluate it on four image datasets, on which we demonstrate the effectiveness of the model and its ability to generalize. Comment: Published as a conference paper at ICLR 201
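    A minimal sketch of the idea described in this abstract: a generator conditioned on two independently sampled latent codes, a "content" code shared across all views of an object and a "view" code for the observation settings. The layer sizes, output resolution, and concatenation strategy below are assumptions for illustration, not the paper's exact architecture.

```python
# Sketch: decoder over a disentangled (content, view) latent space.
import torch
import torch.nn as nn

class ContentViewGenerator(nn.Module):
    def __init__(self, content_dim=64, view_dim=16, img_channels=3):
        super().__init__()
        self.net = nn.Sequential(
            nn.ConvTranspose2d(content_dim + view_dim, 256, 4, 1, 0),  # 1x1 -> 4x4
            nn.BatchNorm2d(256), nn.ReLU(True),
            nn.ConvTranspose2d(256, 128, 4, 2, 1),                     # 4x4 -> 8x8
            nn.BatchNorm2d(128), nn.ReLU(True),
            nn.ConvTranspose2d(128, 64, 4, 2, 1),                      # 8x8 -> 16x16
            nn.BatchNorm2d(64), nn.ReLU(True),
            nn.ConvTranspose2d(64, img_channels, 4, 2, 1),             # 16x16 -> 32x32
            nn.Tanh(),
        )

    def forward(self, content, view):
        # The two factors are fused before decoding, so fixing `content`
        # while varying `view` renders the same object under different views.
        z = torch.cat([content, view], dim=1).unsqueeze(-1).unsqueeze(-1)
        return self.net(z)

# Same object (fixed content code) under 5 random views.
g = ContentViewGenerator()
content = torch.randn(1, 64).expand(5, -1)
views = torch.randn(5, 16)
samples = g(content, views)  # shape: (5, 3, 32, 32)
```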

    Dense 3D Object Reconstruction from a Single Depth View

    In this paper, we propose a novel approach, 3D-RecGAN++, which reconstructs the complete 3D structure of a given object from a single arbitrary depth view using generative adversarial networks. Unlike existing work, which typically requires multiple views of the same object or class labels to recover the full 3D geometry, the proposed 3D-RecGAN++ takes only the voxel grid representation of a depth view of the object as input and generates the complete 3D occupancy grid at a high resolution of 256^3 by recovering the occluded/missing regions. The key idea is to combine the generative capabilities of autoencoders with the conditional Generative Adversarial Networks (GAN) framework to infer accurate and fine-grained 3D structures of objects in high-dimensional voxel space. Extensive experiments on large synthetic datasets and real-world Kinect datasets show that the proposed 3D-RecGAN++ significantly outperforms the state of the art in single-view 3D object reconstruction and is able to reconstruct unseen types of objects. Comment: TPAMI 2018. Code and data are available at: https://github.com/Yang7879/3D-RecGAN-extended. This article extends from arXiv:1708.0796
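    A down-scaled sketch of the encoder-decoder generator idea behind this kind of voxel completion: a partial occupancy grid from one depth view goes in, a completed grid comes out (a conditional discriminator over input/output pairs, as in the GAN framework the abstract mentions, is omitted here). The layer widths and the 32^3 resolution are placeholders; the paper reports outputs at 256^3.

```python
# Sketch: 3D encoder-decoder that completes a partial voxel occupancy grid.
import torch
import torch.nn as nn

class VoxelCompletionNet(nn.Module):
    def __init__(self):
        super().__init__()
        # Encoder: compress the partial 32^3 grid into a latent volume.
        self.enc = nn.Sequential(
            nn.Conv3d(1, 32, 4, 2, 1), nn.LeakyReLU(0.2, True),    # 32 -> 16
            nn.Conv3d(32, 64, 4, 2, 1), nn.LeakyReLU(0.2, True),   # 16 -> 8
            nn.Conv3d(64, 128, 4, 2, 1), nn.LeakyReLU(0.2, True),  # 8 -> 4
        )
        # Decoder: upsample back to the full grid, predicting occupancy
        # probabilities for both observed and occluded regions.
        self.dec = nn.Sequential(
            nn.ConvTranspose3d(128, 64, 4, 2, 1), nn.ReLU(True),   # 4 -> 8
            nn.ConvTranspose3d(64, 32, 4, 2, 1), nn.ReLU(True),    # 8 -> 16
            nn.ConvTranspose3d(32, 1, 4, 2, 1), nn.Sigmoid(),      # 16 -> 32
        )

    def forward(self, partial_grid):
        return self.dec(self.enc(partial_grid))

net = VoxelCompletionNet()
partial = (torch.rand(2, 1, 32, 32, 32) > 0.9).float()  # toy partial grids
completed = net(partial)                                 # (2, 1, 32, 32, 32)
```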

    Associative3D: Volumetric Reconstruction from Sparse Views

    This paper studies the problem of 3D volumetric reconstruction from two views of a scene with an unknown camera. While seemingly easy for humans, this problem poses many challenges for computers, since it requires simultaneously reconstructing objects in the two views while also figuring out their relationship. We propose a new approach that estimates reconstructions, distributions over the camera/object and camera/camera transformations, and an inter-view object affinity matrix. This information is then jointly reasoned over to produce the most likely explanation of the scene. We train and test our approach on a dataset of indoor scenes and rigorously evaluate the merits of our joint reasoning approach. Our experiments show that it is able to recover reasonable scenes from sparse views, although the problem remains challenging. Project site: https://jasonqsy.github.io/Associative3D Comment: ECCV 202
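    An illustrative sketch of one ingredient the abstract names, the inter-view object affinity matrix: each detected object in the two views is represented by an embedding, and pairwise scores indicate which objects are the same instance. The embedding dimension and the cosine-similarity scoring are assumptions for illustration, not the paper's definitions.

```python
# Sketch: affinity matrix between objects detected in two views.
import torch
import torch.nn.functional as F

def affinity_matrix(emb_view1, emb_view2):
    """Cosine-similarity affinities between objects in two views.

    emb_view1: (N1, D) embeddings of objects detected in view 1
    emb_view2: (N2, D) embeddings of objects detected in view 2
    returns:   (N1, N2) matrix; entry (i, j) scores how likely object i
               in view 1 and object j in view 2 are the same instance.
    """
    a = F.normalize(emb_view1, dim=1)
    b = F.normalize(emb_view2, dim=1)
    return a @ b.t()

# Toy example: 3 objects in view 1, 4 in view 2, 32-d embeddings.
A = affinity_matrix(torch.randn(3, 32), torch.randn(4, 32))
matches = A.argmax(dim=1)  # greedy per-object correspondence from view 1 to view 2
```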

    Virtual patient-specific treatment verification using machine learning methods to assist the dose deliverability evaluation of radiotherapy prostate plans

    Machine Learning (ML) methods represent a potential tool to support and optimize virtual patient-specific plan verification within radiotherapy workflows. However, previously reported applications did not consider the actual physical implications for the quality of the predictors and the model performance, nor did they report the pertinence of the implementation or its limitations. The main goal of this thesis was therefore to predict dose deliverability using different ML models and input predictor features, analysing the physical aspects involved in the predictions in order to propose a reliable decision-support tool for virtual patient-specific plan verification protocols. Among the principal predictors explored in this thesis, numerical and high-dimensional features based on modulation complexity, treatment-unit parameters, and dosimetric plan parameters were implemented by designing random forest (RF), extreme gradient boosting (XG-Boost), neural network (NN), and convolutional neural network (CNN) models to predict gamma passing rates (GPR) for prostate treatments. This research highlights three principal findings. (1) The heterogeneity of the dataset composition directly impacts the quality of the predictor features and, subsequently, the model performance. (2) Models based on automatically extracted features (CNN models applied to multi-leaf-collimator modulation maps, MM) presented a more independent and transferable prediction performance. (3) ML algorithms incorporated in radiotherapy workflows for virtual plan verification need to retrieve the treatment plan parameters associated with each prediction in order to support the model's reliability and stability. Finally, this thesis presents how the most relevant automatically extracted features from the activation maps were used to suggest an alternative decision-support tool for comprehensively evaluating the causes of the predicted dose deliverability.
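    A minimal sketch of one of the tabular models the thesis compares: a random forest regressor mapping per-plan complexity and treatment-unit features to a gamma passing rate (GPR). The feature names and the synthetic data below are placeholders purely for illustration, not the thesis dataset or its exact feature set.

```python
# Sketch: random forest regression of gamma passing rates from plan features.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_absolute_error

rng = np.random.default_rng(0)
n_plans = 500
# Hypothetical per-plan predictors, e.g. modulation complexity score,
# total monitor units, mean leaf gap, mean dose rate.
X = rng.normal(size=(n_plans, 4))
# Synthetic GPR targets in percent, loosely tied to the features.
y = np.clip(95 + X @ np.array([-1.5, -0.8, 1.2, 0.5])
            + rng.normal(scale=1.0, size=n_plans), 80, 100)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
model = RandomForestRegressor(n_estimators=200, random_state=0)
model.fit(X_train, y_train)
pred = model.predict(X_test)
print("MAE (percentage points):", mean_absolute_error(y_test, pred))
# Feature importances give a first look at which plan parameters drive the prediction.
print("importances:", model.feature_importances_)
```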