
    Weakly supervised 3D Reconstruction with Adversarial Constraint

    Supervised 3D reconstruction has witnessed significant progress through the use of deep neural networks. However, this increase in performance requires large-scale annotation of 2D/3D data. In this paper, we explore inexpensive 2D supervision as an alternative to expensive 3D CAD annotation. Specifically, we use foreground masks as weak supervision through a raytrace pooling layer that enables perspective projection and backpropagation. Additionally, since 3D reconstruction from masks is an ill-posed problem, we propose to constrain the reconstruction to the manifold of unlabeled realistic 3D shapes that match the mask observations. We demonstrate that learning a log-barrier solution to this constrained optimization problem resembles the GAN objective, enabling the use of existing tools for training GANs. We evaluate and analyze the manifold-constrained reconstruction on various datasets for single- and multi-view reconstruction of both synthetic and real images.
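    A minimal NumPy sketch of the mask-supervision idea. This is illustrative only: it substitutes an orthographic max-projection for the paper's differentiable raytrace pooling layer, and all names (`project_silhouette`, `mask_loss`) are invented here, not from the paper.

```python
import numpy as np

def project_silhouette(voxels):
    """Orthographic stand-in for raytrace pooling: a pixel is
    foreground if any voxel along its viewing ray is occupied."""
    # voxels: (D, H, W) occupancy probabilities in [0, 1]
    return voxels.max(axis=0)  # (H, W) predicted mask

def mask_loss(voxels, target_mask):
    """Binary cross-entropy between the projected and observed masks."""
    pred = np.clip(project_silhouette(voxels), 1e-7, 1 - 1e-7)
    return -np.mean(target_mask * np.log(pred)
                    + (1 - target_mask) * np.log(1 - pred))

# toy example: a small cube whose projection matches the mask exactly
vox = np.zeros((4, 4, 4))
vox[1:3, 1:3, 1:3] = 1.0
mask = vox.max(axis=0)
```

    In the paper this loss is minimized jointly with a GAN-style manifold term; the sketch shows only the weak 2D supervision signal.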

    DeformNet: Free-Form Deformation Network for 3D Shape Reconstruction from a Single Image

    3D reconstruction from a single image is a key problem in applications ranging from robotic manipulation to augmented reality. Prior methods have tackled this problem with generative models that predict 3D reconstructions as voxels or point clouds. However, these methods can be computationally expensive and miss fine details. We introduce a new differentiable layer for 3D data deformation and use it in DeformNet to learn a model for 3D reconstruction-through-deformation. DeformNet takes an image as input, retrieves the nearest shape template from a database, and deforms the template to match the query image. We evaluate our approach on the ShapeNet dataset and show that: (a) the Free-Form Deformation layer is a powerful new building block for deep learning models that manipulate 3D data; (b) DeformNet combines this FFD layer with shape retrieval for smooth, detail-preserving 3D reconstruction of qualitatively plausible point clouds with respect to a single query image; and (c) compared to other state-of-the-art 3D reconstruction methods, DeformNet matches or outperforms their benchmarks by significant margins. For more information, visit: https://deformnet-site.github.io/DeformNet-website/
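    The core of free-form deformation can be sketched without any learning: displace each point by a trilinear blend of control-point offsets. This toy version uses a minimal 2x2x2 lattice (DeformNet learns the offsets with a network and uses a denser lattice; `ffd_trilinear` is a name invented for this sketch).

```python
import numpy as np

def ffd_trilinear(points, corner_offsets):
    """Free-Form Deformation on a 2x2x2 control lattice: each point in
    the unit cube is displaced by the trilinear blend of the eight
    corner control-point offsets."""
    # points: (N, 3) in [0, 1]^3; corner_offsets: (2, 2, 2, 3)
    x, y, z = points[:, 0], points[:, 1], points[:, 2]
    out = points.copy()
    for i in (0, 1):
        for j in (0, 1):
            for k in (0, 1):
                # trilinear weight of corner (i, j, k) for every point
                w = ((x if i else 1 - x)
                     * (y if j else 1 - y)
                     * (z if k else 1 - z))
                out = out + w[:, None] * corner_offsets[i, j, k]
    return out
```

    Because the eight weights sum to one, zero offsets leave the shape unchanged and equal offsets translate it rigidly; unequal offsets bend the enclosed shape smoothly, which is what makes the layer useful for detail-preserving deformation.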

    Accelerated volumetric reconstruction from uncalibrated camera views

    While both work with images, computer graphics and computer vision are inverse problems of each other. Computer graphics traditionally starts with input geometric models and produces image sequences; computer vision starts with input image sequences and produces geometric models. In recent years, research has converged to bridge the gap between the two fields, producing a new field called Image-Based Modeling and Rendering (IBMR). IBMR uses the geometric information recovered from real images to generate new images, with the hope that the synthesized ones appear photorealistic, while also reducing the time spent on model creation. In this dissertation, the capturing, geometric, and photometric aspects of an IBMR system are studied. A versatile framework was developed that enables the reconstruction of scenes from images acquired with a handheld digital camera. The proposed system targets applications in areas such as computer gaming and virtual reality from a low-cost perspective. In the spirit of IBMR, the human operator provides the high-level information, while underlying algorithms perform the low-level computational work. Conforming to the latest architecture trends, we propose a streaming voxel carving method that allows fast GPU-based processing on commodity hardware.
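    The principle behind voxel carving can be shown in a few lines. This is a toy orthographic version, not the dissertation's streaming GPU method: a voxel survives only if it projects into the foreground silhouette of every view (the function name `carve` and the two-view setup are assumptions of this sketch).

```python
import numpy as np

def carve(grid_shape, views):
    """Orthographic voxel carving: keep a voxel only if it lies inside
    the foreground mask of every view. Each view is (mask, axis) where
    axis is the grid axis the view projects along."""
    occupied = np.ones(grid_shape, dtype=bool)
    for mask, axis in views:
        # broadcast each 2D silhouette along its projection axis
        occupied &= np.expand_dims(mask.astype(bool), axis=axis)
    return occupied
```

    Real systems use perspective projection and calibrated camera poses; the streaming aspect of the dissertation's method comes from processing the voxel grid in slices so it never has to reside in GPU memory all at once.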

    3D Scene Reconstruction with Micro-Aerial Vehicles and Mobile Devices

    Scene reconstruction is the process of building an accurate geometric model of one's environment from sensor data. We explore the problem of real-time, large-scale 3D scene reconstruction in indoor environments using small laser range-finders and low-cost RGB-D (color plus depth) cameras. We focus on computationally constrained platforms such as micro-aerial vehicles (MAVs) and mobile devices. These platforms present a set of fundamental challenges: estimating the state and trajectory of the device as it moves within its environment, and using lightweight, dynamic data structures to hold the representation of the reconstructed scene. The system must be computationally and memory-efficient so that it can run in real time onboard the platform. In this work, we present three scene reconstruction systems. The first uses a laser range-finder and operates onboard a quadrotor MAV. We address autonomous control, state estimation, path planning, and teleoperation, and we propose the multi-volume occupancy grid (MVOG), a novel data structure for building 3D maps from laser data that provides a compact, probabilistic scene representation. The second system uses an RGB-D camera to recover the 6-DoF trajectory of the platform by aligning sparse features observed in the current RGB-D image against a model of previously seen features. We discuss our work on camera calibration and the depth measurement model, and we apply the system onboard an MAV to produce occupancy-based 3D maps, which we use for path planning. Finally, we present our contributions to a scene reconstruction system for mobile devices with built-in depth sensing and motion-tracking capabilities. We demonstrate reconstructing and rendering a global mesh on the fly, using only the mobile device's CPU, in very large (300 square meter) scenes, at a resolution of 2-3 cm. To achieve this, we divide the scene into spatial volumes indexed by a hash map. Each volume contains the truncated signed distance function for that area of space, as well as the mesh segment derived from the distance function. This approach allows us to focus computational and memory resources only on areas of the scene that are currently observed, and to leverage parallelization techniques for multi-core processing.
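    The hash-map-of-volumes idea above can be sketched directly: a dictionary maps an integer volume index to a dense TSDF block, allocated lazily. The voxel size and chunk size below are assumptions chosen to match the reported 2-3 cm resolution, not values from the work itself.

```python
import numpy as np

VOXEL_SIZE = 0.02   # metres; within the 2-3 cm resolution reported
VOLUME_SIDE = 16    # voxels per volume edge (an assumed chunk size)

volumes = {}        # hash map: integer volume index -> dense TSDF block

def volume_index(point):
    """Map a 3D point (in metres) to the index of its spatial volume."""
    return tuple(np.floor(point / (VOXEL_SIZE * VOLUME_SIDE)).astype(int))

def get_volume(point):
    """Lazily allocate a TSDF block, so memory is spent only on regions
    of the scene that are actually observed."""
    key = volume_index(point)
    if key not in volumes:
        # +1.0 = truncated distance "far in front of any surface"
        volumes[key] = np.full((VOLUME_SIDE,) * 3, 1.0, dtype=np.float32)
    return volumes[key]
```

    Because unobserved space never gets a dictionary entry, memory scales with the observed surface rather than the bounding box, and independent volumes can be meshed in parallel on separate cores.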

    From light rays to 3D models


    Analysis of 3D human gait reconstructed with a depth camera and mirrors

    The problem of assessing human gait has received great attention in the literature, since gait analysis is a key component of healthcare. Marker-based, multi-camera systems are widely employed for this task, but such systems usually require specific equipment with a high price and/or high computational cost. To reduce the cost of the apparatus, we focus on a gait analysis system that employs only one depth sensor. The principle of our work is similar to multi-camera systems, but the collection of cameras is replaced by a single depth sensor and mirrors. Each mirror in our setup plays the role of a camera that captures the scene from a different viewpoint. Since we use only one camera, the synchronization step can be avoided and the cost of the device is also reduced. Our work can be separated into two parts, 3D reconstruction and gait analysis, where the result of the former is used as the input of the latter. Our system for 3D reconstruction is built with a depth camera and two mirrors. Two types of depth sensor, distinguished by their depth estimation scheme, were employed in our work. With the structured light (SL) technique integrated into the Kinect 1, we perform 3D reconstruction based on geometrical optics. To increase the level of detail of the reconstructed 3D model, the Kinect 2, which estimates depth by time-of-flight (ToF), is then used for image acquisition instead of the previous generation. However, due to multiple reflections on the mirrors, depth distortion occurs in our setup. We therefore propose a simple approach for reducing this distortion before applying geometrical optics to reconstruct a point cloud of the 3D object. For gait analysis, we propose various alternative approaches focused on measuring gait normality and symmetry. These should be useful in clinical treatment, for example to monitor a patient's recovery after surgery. The methods comprise model-free and model-based approaches, each with different pros and cons. In this dissertation, we present three methods that directly process the point clouds reconstructed in the previous part. The first uses cross-correlation of the left and right half-bodies to assess gait symmetry, while the other two employ deep auto-encoders to measure gait normality.
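    The cross-correlation symmetry measure can be illustrated on 1D signals. This is a sketch of the idea, not the thesis code: in a symmetric gait a signal from the right half-body is roughly a half-cycle-shifted copy of the left, so the best normalized cross-correlation over all lags approaches 1.0 (the name `symmetry_score` and the sinusoidal toy signals are invented here).

```python
import numpy as np

def symmetry_score(left, right):
    """Z-normalise a signal from each half-body, then return the best
    normalised cross-correlation over all circular lags. A score of
    1.0 means one half is an exact time-shifted copy of the other."""
    l = (left - left.mean()) / left.std()
    r = (right - right.mean()) / right.std()
    return max(np.mean(l * np.roll(r, k)) for k in range(len(l)))

# toy gait signals: the right leg is half a cycle out of phase
t = np.linspace(0, 4 * np.pi, 200, endpoint=False)
left_signal = np.sin(t)
right_signal = np.sin(t + np.pi)
```

    In the thesis the inputs are trajectories extracted from reconstructed point clouds rather than sinusoids, but the scoring principle is the same: asymmetry (e.g. a limp) lowers the best achievable correlation.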
    • 

    corecore