Multi-View Data Generation Without View Supervision
The development of high-dimensional generative models has recently seen a
surge of interest with the introduction of variational auto-encoders and
generative adversarial networks. Different variants have been proposed
where the underlying latent space is structured, for example, based on
attributes describing the data to generate. We focus on a particular problem
where one aims at generating samples corresponding to a number of objects under
various views. We assume that the distribution of the data is driven by two
independent latent factors: the content, which represents the intrinsic
features of an object, and the view, which stands for the settings of a
particular observation of that object. Therefore, we propose a generative model
and a conditional variant built on such a disentangled latent space. This
approach allows us to generate realistic samples corresponding to various
objects under a wide variety of views. Unlike many multi-view approaches, our
model does not require supervision on the views, only on the content.
Compared to other conditional generation approaches, which are mostly based on
binary or categorical attributes, we make no such assumption about the factors
of variation. Our model can be used on problems with a huge, potentially
infinite, number of categories. We evaluate it on four image datasets and
demonstrate the effectiveness of the model and its ability to generalize.
Comment: Published as a conference paper at ICLR 201
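The content/view factorization described above can be illustrated with a toy numpy sketch: a sample is produced from two independent latent vectors, one for the object's intrinsic features (content) and one for the observation settings (view). The linear "generator" weights, dimensions, and function names below are illustrative assumptions, not the paper's actual learned network.

```python
import numpy as np

rng = np.random.default_rng(0)

CONTENT_DIM, VIEW_DIM, DATA_DIM = 8, 4, 16

# Toy "generator" weights; a real model would learn these adversarially.
W_content = rng.standard_normal((DATA_DIM, CONTENT_DIM))
W_view = rng.standard_normal((DATA_DIM, VIEW_DIM))

def generate(content, view):
    """Map a (content, view) latent pair to a data sample."""
    return W_content @ content + W_view @ view

# Same object under two different views: content fixed, view resampled.
content = rng.standard_normal(CONTENT_DIM)
view_a = rng.standard_normal(VIEW_DIM)
view_b = rng.standard_normal(VIEW_DIM)

sample_a = generate(content, view_a)
sample_b = generate(content, view_b)

# The two samples share the content component but differ in view.
assert sample_a.shape == (DATA_DIM,)
assert not np.allclose(sample_a, sample_b)
```

Because the two factors are independent, fixing the content and sweeping the view yields different renderings of the same object, which is the generation mode the abstract describes.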
Dense 3D Object Reconstruction from a Single Depth View
In this paper, we propose a novel approach, 3D-RecGAN++, which reconstructs
the complete 3D structure of a given object from a single arbitrary depth view
using generative adversarial networks. Unlike existing work which typically
requires multiple views of the same object or class labels to recover the full
3D geometry, the proposed 3D-RecGAN++ only takes the voxel grid representation
of a depth view of the object as input, and is able to generate the complete 3D
occupancy grid with a high resolution of 256^3 by recovering the
occluded/missing regions. The key idea is to combine the generative
capabilities of autoencoders and the conditional Generative Adversarial
Networks (GAN) framework, to infer accurate and fine-grained 3D structures of
objects in high-dimensional voxel space. Extensive experiments on large
synthetic datasets and real-world Kinect datasets show that the proposed
3D-RecGAN++ significantly outperforms the state of the art in single view 3D
object reconstruction, and is able to reconstruct unseen types of objects.
Comment: TPAMI 2018. Code and data are available at:
https://github.com/Yang7879/3D-RecGAN-extended. This article extends from
arXiv:1708.0796
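The task setup in the abstract (a partial voxel grid from one depth view mapped to a complete occupancy grid) can be sketched in a few lines of numpy. The 8^3 grid (instead of the paper's 256^3), the cube "object", and the voxel-IoU scoring are simplifying assumptions for illustration, not the paper's data or metric definitions.

```python
import numpy as np

N = 8
complete = np.zeros((N, N, N), dtype=bool)
complete[2:6, 2:6, 2:6] = True          # full 3D object: a 4x4x4 cube

# A single depth view along +z sees only the nearest occupied voxel per
# (x, y) ray; everything behind it is occluded/missing.
partial = np.zeros_like(complete)
for x in range(N):
    for y in range(N):
        zs = np.flatnonzero(complete[x, y])
        if zs.size:
            partial[x, y, zs[0]] = True

def iou(pred, target):
    """Voxel intersection-over-union between two occupancy grids."""
    inter = np.logical_and(pred, target).sum()
    union = np.logical_or(pred, target).sum()
    return inter / union

# A reconstruction network would map `partial` back toward `complete`.
assert partial.sum() == 16              # only the 4x4 front face is visible
assert iou(partial, complete) == 16 / 64
```

The gap between the visible surface (16 voxels) and the full shape (64 voxels) is exactly the occluded region the reconstruction network must hallucinate.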
Associative3D: Volumetric Reconstruction from Sparse Views
This paper studies the problem of 3D volumetric reconstruction from two views
of a scene with an unknown camera. While seemingly easy for humans, this
problem poses many challenges for computers since it requires simultaneously
reconstructing objects in the two views while also figuring out their
relationship. We propose a new approach that estimates reconstructions,
distributions over the camera/object and camera/camera transformations, as well
as an inter-view object affinity matrix. This information is then jointly
reasoned over to produce the most likely explanation of the scene. We train and
test our approach on a dataset of indoor scenes, and rigorously evaluate the
merits of our joint reasoning approach. Our experiments show that it can
recover reasonable scenes from sparse views, though the problem remains
challenging. Project site: https://jasonqsy.github.io/Associative3D
Comment: ECCV 202
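The inter-view object affinity matrix mentioned above can be sketched as follows: objects detected in each view get an embedding, pairwise similarities form the affinity matrix, and a correspondence is read off it. The embeddings and the greedy argmax matching are illustrative assumptions; the paper jointly reasons over affinities together with camera/object and camera/camera transformations rather than matching greedily.

```python
import numpy as np

# Per-object embeddings for the objects detected in each view.
emb_view1 = np.array([[1.0, 0.0],
                      [0.0, 1.0]])     # 2 objects in view 1
emb_view2 = np.array([[0.9, 0.1],
                      [0.1, 0.8]])     # 2 objects in view 2

# Affinity = similarity between every cross-view pair of objects.
affinity = emb_view1 @ emb_view2.T

# Greedy decoding: each view-1 object picks its best view-2 match.
matches = affinity.argmax(axis=1)

assert affinity.shape == (2, 2)
assert matches.tolist() == [0, 1]       # object i in view 1 <-> object i in view 2
```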
Virtual patient-specific treatment verification using machine learning methods to assist the dose deliverability evaluation of radiotherapy prostate plans
Machine Learning (ML) methods represent a potential tool to support and optimize virtual patient-specific plan verification within radiotherapy workflows. However, previously reported applications did not consider the actual physical implications for predictor quality and model performance, nor did they report on the pertinence of the implementation or its limitations. Therefore, the main goal of this thesis was to predict dose deliverability using different ML models and input predictor features, analysing the physical aspects involved in the predictions to propose a reliable decision-support tool for virtual patient-specific plan verification protocols. Among the principal predictors explored in this thesis, numerical and high-dimensional features based on modulation complexity, treatment-unit parameters, and dosimetric plan parameters were all implemented by designing random forest (RF), extreme gradient boosting (XG-Boost), neural network (NN), and convolutional neural network (CNN) models to predict gamma passing rates (GPR) for prostate treatments. Accordingly, this research highlights three principal findings. (1) The heterogeneity of the dataset composition directly impacts the quality of the predictor features and, subsequently, the model performance. (2) Models based on automatically extracted features (CNN models) of multi-leaf-collimator modulation maps (MM) presented a more independent and transferable prediction performance. Furthermore, (3) ML algorithms incorporated in radiotherapy workflows for virtual plan verification are required to retrieve the treatment plan parameters associated with each prediction to support the model's reliability and stability. Finally, this thesis presents how the most relevant automatically extracted features from the activation maps were considered to suggest an alternative decision-support tool to comprehensively evaluate the causes of the predicted dose deliverability.
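One of the predictor families the abstract names, modulation-complexity features of multi-leaf-collimator modulation maps, can be sketched with a toy example. The aperture maps and the specific feature here (mean absolute difference between adjacent leaf openings) are hypothetical illustrations, not the thesis's exact feature definitions; tree-based or CNN models would then relate such features to gamma passing rates (GPR).

```python
import numpy as np

def modulation_feature(mm):
    """Mean absolute difference between adjacent leaf openings in a
    modulation map (rows = control points, columns = leaf pairs)."""
    return np.abs(np.diff(mm, axis=1)).mean()

# Two toy modulation maps: a uniform (smooth) plan and a highly
# modulated plan with alternating open/closed leaves.
smooth_plan = np.ones((4, 6))
modulated_plan = np.tile([0.0, 1.0, 0.0], (4, 2))   # rows [0,1,0,0,1,0]

# A more modulated plan yields a larger complexity score, which such
# models typically associate with lower predicted GPR.
assert modulation_feature(smooth_plan) == 0.0
assert abs(modulation_feature(modulated_plan) - 0.8) < 1e-12
```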