
    3D Face Tracking and Texture Fusion in the Wild

    We present a fully automatic approach to real-time 3D face reconstruction from monocular in-the-wild videos. Using cascaded-regressor-based face tracking and 3D Morphable Face Model shape fitting, we obtain a semi-dense 3D face shape. We further use the texture information from multiple frames to build a holistic 3D face representation from the video. Our system captures facial expressions and does not require any person-specific training. We demonstrate the robustness of our approach on the challenging 300 Videos in the Wild (300-VW) dataset. Our real-time fitting framework is available as an open-source library at http://4dface.org.
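    The multi-frame texture fusion step can be illustrated with a minimal sketch: assuming each video frame has already been remapped into a common UV texture space (e.g. via the fitted Morphable Model) together with a per-texel confidence weight, a holistic texture is obtained as a confidence-weighted average. The function and weighting scheme below are illustrative assumptions, not the library's actual API.

        import numpy as np

        def fuse_textures(uv_textures, weights, eps=1e-8):
            """Fuse per-frame UV textures into one holistic texture.

            uv_textures: list of (H, W, 3) float arrays, one per frame,
                         each already remapped into the shared UV space.
            weights:     list of (H, W) float arrays, e.g. visibility or
                         view-angle confidence per texel (0 where unseen).
            """
            acc = np.zeros_like(uv_textures[0], dtype=np.float64)
            wacc = np.zeros(uv_textures[0].shape[:2], dtype=np.float64)
            for tex, w in zip(uv_textures, weights):
                acc += tex * w[..., None]      # accumulate weighted colors
                wacc += w                      # accumulate weights per texel
            return acc / (wacc[..., None] + eps)

        # Toy usage: three noisy 4x4 "frames" of the same texture.
        rng = np.random.default_rng(0)
        gt = rng.random((4, 4, 3))
        frames = [gt + 0.05 * rng.standard_normal(gt.shape) for _ in range(3)]
        ws = [np.ones((4, 4)) for _ in range(3)]
        fused = fuse_textures(frames, ws)
        print(np.abs(fused - gt).mean())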

    Towards Skeleton based Reconstruction : From Projective Skeletonization to Canal Surface Estimation

    We present a novel approach to reconstruct a 3D object from images corresponding to two different viewpoints: we estimate the skeleton of the object instead of its surface. The originality of the method is its ability to reconstruct a tubular object from a limited number of input images. Unlike classical reconstruction methods, such as multi-view stereo or, more recently, structure-from-motion, this approach does not rely on interest points but estimates the topology of the object and derives its surface. Our contributions are twofold. First, given two perspective images of the 3D shape, the projection of the skeleton is computed in 2D. Second, the 3D skeleton is reconstructed from the two projections using triangulation and matching. A mesh is finally derived for each skeleton branch.
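    The second step, lifting the two projected skeletons to 3D, is essentially point triangulation between matched skeleton samples. Below is a minimal sketch of that step, assuming calibrated projection matrices and already-matched 2D skeleton points (the matching and canal-surface estimation from the paper are not reproduced here).

        import numpy as np

        def triangulate_point(P1, P2, x1, x2):
            """Linear (DLT) triangulation of one 3D point from two views.

            P1, P2: (3, 4) camera projection matrices.
            x1, x2: (2,) matched image points of the same skeleton sample.
            """
            A = np.stack([
                x1[0] * P1[2] - P1[0],
                x1[1] * P1[2] - P1[1],
                x2[0] * P2[2] - P2[0],
                x2[1] * P2[2] - P2[1],
            ])
            _, _, vt = np.linalg.svd(A)
            X = vt[-1]
            return X[:3] / X[3]          # dehomogenize

        # Toy usage: two simple cameras observing one skeleton sample point.
        P1 = np.hstack([np.eye(3), np.zeros((3, 1))])                  # camera at origin
        P2 = np.hstack([np.eye(3), np.array([[-1.0], [0.0], [0.0]])])  # translated camera
        X_true = np.array([0.2, -0.1, 4.0, 1.0])
        x1 = P1 @ X_true; x1 = x1[:2] / x1[2]
        x2 = P2 @ X_true; x2 = x2[:2] / x2[2]
        print(triangulate_point(P1, P2, x1, x2))   # ~ [0.2, -0.1, 4.0]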

    Point Pair Feature based Object Detection for Random Bin Picking

    Point pair features are a popular representation for free-form 3D object detection and pose estimation. In this paper, their performance in an industrial random bin picking context is investigated. A new method to generate representative synthetic datasets is proposed. This allows us to investigate the influence of a high degree of clutter and the presence of self-similar features, which are typical of our application. We provide an overview of solutions proposed in the literature and discuss their strengths and weaknesses. A simple heuristic method to drastically reduce the computational complexity is introduced, which results in improved robustness, speed and accuracy compared to the naive approach.
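    As background, the point pair feature on which such detectors build encodes two oriented points by a distance and three angles, quantized into a hash key so that similar pairs collide. The following is a minimal, illustrative sketch; the quantization steps and the toy model are assumptions, not the paper's settings.

        import numpy as np

        def angle(a, b):
            """Angle between two vectors, in [0, pi]."""
            cos = np.clip(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)), -1.0, 1.0)
            return np.arccos(cos)

        def point_pair_feature(p1, n1, p2, n2):
            """F = (||d||, angle(n1, d), angle(n2, d), angle(n1, n2))."""
            d = p2 - p1
            return np.array([np.linalg.norm(d), angle(n1, d), angle(n2, d), angle(n1, n2)])

        def quantize(feature, d_step=0.01, a_step=np.deg2rad(12)):
            """Discretize the feature so that similar pairs hash to the same key."""
            steps = np.array([d_step, a_step, a_step, a_step])
            return tuple((feature / steps).astype(int))

        # Toy usage: build a tiny hash table from model point pairs.
        model_pts = np.random.default_rng(1).random((5, 3))
        model_nrm = np.tile(np.array([0.0, 0.0, 1.0]), (5, 1))
        table = {}
        for i, (pi, ni) in enumerate(zip(model_pts, model_nrm)):
            for j, (pj, nj) in enumerate(zip(model_pts, model_nrm)):
                if i != j:
                    key = quantize(point_pair_feature(pi, ni, pj, nj))
                    table.setdefault(key, []).append((i, j))
        print(len(table), "distinct keys")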

    Learned Semantic Multi-Sensor Depth Map Fusion

    Volumetric depth map fusion based on truncated signed distance functions has become a standard method and is used in many 3D reconstruction pipelines. In this paper, we generalize this classic method in multiple ways: 1) Semantics: Semantic information enriches the scene representation and is incorporated into the fusion process. 2) Multi-sensor: Depth information can originate from different sensors or algorithms with very different noise and outlier statistics, which are considered during data fusion. 3) Scene denoising and completion: Sensors can fail to recover depth for certain materials and lighting conditions, or data can be missing due to occlusions. Our method denoises the geometry, closes holes and computes a watertight surface for every semantic class. 4) Learning: We propose a neural network reconstruction method that unifies all these properties within a single powerful framework. Our method learns sensor or algorithm properties jointly with semantic depth fusion and scene completion and can also be used as an expert system, e.g. to unify the strengths of various photometric stereo algorithms. Our approach is the first to unify all these properties. Experimental evaluations on both synthetic and real datasets demonstrate clear improvements. Comment: 11 pages, 7 figures, 2 tables, accepted for the 2nd Workshop on 3D Reconstruction in the Wild (3DRW2019) in conjunction with ICCV 2019.
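    The classical TSDF fusion that this work generalizes is a per-voxel weighted running average of truncated signed distances. Below is a minimal single-voxel sketch; the per-sensor confidence weight is a hypothetical stand-in for the learned sensor properties described in the abstract.

        def tsdf_update(D, W, d_obs, w_obs, trunc=0.05, w_max=100.0):
            """Fuse one depth observation into a voxel's TSDF value.

            D, W:   current truncated signed distance and accumulated weight.
            d_obs:  signed distance of the voxel to the observed surface.
            w_obs:  observation weight (e.g. per-sensor confidence).
            """
            d_obs = max(-trunc, min(trunc, d_obs))         # truncate the SDF
            D_new = (W * D + w_obs * d_obs) / (W + w_obs)  # weighted running average
            W_new = min(W + w_obs, w_max)                  # cap accumulated weight
            return D_new, W_new

        # Toy usage: a voxel seen twice by a reliable sensor and once by a noisy one.
        D, W = 0.0, 0.0
        for d_obs, w_obs in [(0.02, 1.0), (0.03, 1.0), (0.20, 0.2)]:  # last: noisy outlier
            D, W = tsdf_update(D, W, d_obs, w_obs)
        print(D, W)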

    Capturing Synchronous Collaborative Design Activities: A State-Of-The-Art Technology Review


    Monocular 3D Body Shape Reconstruction under Clothing

    Estimating the 3D shape of objects from monocular images is a well-established and challenging task in the computer vision field. Further challenges arise when highly deformable objects, such as human faces or bodies, are considered. In this work, we address the problem of estimating the 3D shape of a human body from single images. In particular, we provide a solution to the problem of estimating the shape of the body when the subject is wearing clothes. This is a highly challenging scenario, as loose clothes might hide the underlying body shape to a large extent. To this end, we make use of a parametric 3D body model, SMPL, whose parameters describe the pose and shape of the body. Our main intuition is that the shape parameters associated with an individual should not change whether the subject is wearing clothes or not. To improve the shape estimation under clothing, we train a deep convolutional network to regress the shape parameters from a single image of a person. To increase the robustness to clothing, we build our training dataset by associating the shape parameters of a “minimally clothed” person to other samples of the same person wearing looser clothes. Experimental validation shows that our approach can more accurately estimate body shape parameters with respect to state-of-the-art approaches, even in the case of loose clothes.
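    The core supervision idea can be sketched as follows: a network regresses the SMPL shape parameters from an image of a clothed person, but the regression target is the shape recovered from a minimally clothed capture of the same subject. The tiny network and random tensors below are placeholders, not the paper's architecture or data.

        import torch
        import torch.nn as nn

        # Minimal shape regressor: image of a (possibly clothed) person -> 10 SMPL betas.
        class ShapeRegressor(nn.Module):
            def __init__(self, num_betas=10):
                super().__init__()
                self.features = nn.Sequential(
                    nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
                    nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
                    nn.AdaptiveAvgPool2d(1),
                )
                self.head = nn.Linear(32, num_betas)

            def forward(self, img):
                return self.head(self.features(img).flatten(1))

        model = ShapeRegressor()
        opt = torch.optim.Adam(model.parameters(), lr=1e-4)

        # One toy training step: the clothed image is supervised with the betas
        # associated with the *minimally clothed* capture of the same subject.
        clothed_img = torch.randn(8, 3, 224, 224)   # batch of clothed-person crops
        betas_minimal = torch.randn(8, 10)          # target shape parameters
        pred = model(clothed_img)
        loss = nn.functional.mse_loss(pred, betas_minimal)
        opt.zero_grad(); loss.backward(); opt.step()
        print(float(loss))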

    On the Production of Semantic and Textured 3D Meshes of Large scale Urban Environments from Mobile Mapping Images and LiDAR scans

    In this paper we present a fully automatic framework for the reconstruction of a 3D mesh, its texture mapping and its semantization, using oriented images and LiDAR scans acquired over a large urban area by a terrestrial Mobile Mapping System (MMS). First, the acquired points and images are sliced into temporal chunks, ensuring a reasonable size and time consistency between geometry (points) and photometry (images). Then, a simple and fast 3D surface reconstruction relying on the sensor space topology is performed on each chunk after an isotropic sampling of the point cloud obtained from the raw LiDAR scans. The method of [31] is subsequently adapted to texture the reconstructed surface with the images acquired simultaneously, ensuring a high-quality texture and global color adjustment. Finally, based on the texturing scheme, a per-texel semantization is conducted on the final model.
    Keywords: urban scene, mobile mapping, LiDAR, surface reconstruction, texturing, semantization, deep learning.
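    The final per-texel semantization step can be sketched as a lookup: since the texturing scheme records, for every texel, which image and pixel it was sampled from, a per-image semantic segmentation can be transferred directly onto the texture atlas. The inputs below are hypothetical; the actual pipeline obtains the label maps from a deep segmentation network run on the mobile-mapping images.

        import numpy as np

        def semantize_atlas(texel_src_img, texel_src_uv, semantic_maps, void_label=255):
            """Transfer per-image semantic labels onto the texture atlas.

            texel_src_img: (H, W) int array, index of the source image per texel
                           (-1 for texels not covered by any image).
            texel_src_uv:  (H, W, 2) int array, (x, y) pixel coordinates in that image.
            semantic_maps: list of (h, w) int label maps, one per input image.
            """
            H, W = texel_src_img.shape
            labels = np.full((H, W), void_label, dtype=np.uint8)
            for img_idx, sem in enumerate(semantic_maps):
                mask = texel_src_img == img_idx
                uv = texel_src_uv[mask]
                labels[mask] = sem[uv[:, 1], uv[:, 0]]   # (x, y) -> row, col lookup
            return labels

        # Toy usage: a 2x2 atlas textured from two 4x4 images.
        sem0 = np.zeros((4, 4), dtype=np.uint8)       # e.g. "road"
        sem1 = np.ones((4, 4), dtype=np.uint8)        # e.g. "building"
        src_img = np.array([[0, 0], [1, -1]])
        src_uv = np.zeros((2, 2, 2), dtype=int)
        print(semantize_atlas(src_img, src_uv, [sem0, sem1]))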

    Eigenloss: Combined PCA-Based Loss Function for Polyp Segmentation

    Colorectal cancer is one of the leading causes of cancer death worldwide, but early diagnosis greatly improves survival rates. The success of deep learning has also benefited this clinical field. When training a deep learning model, it is optimized based on the selected loss function. In this work, we consider two networks (U-Net and LinkNet) and two backbones (VGG-16 and DenseNet121). We analyzed the influence of seven loss functions and used a principal component analysis (PCA) to determine whether the PCA-based decomposition allows defining the coefficients of a non-redundant primal loss function that can outperform the individual loss functions and different linear combinations. The eigenloss is defined as a linear combination of the individual losses, using the elements of the eigenvector as coefficients. Empirical results show that the proposed eigenloss improves the general performance of the individual loss functions and outperforms other linear combinations when LinkNet is used, showing potential for its application in polyp segmentation problems.
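    The construction of such an eigenloss can be sketched as: evaluate the individual losses on a set of samples, run PCA on the resulting loss matrix, and use the entries of the leading eigenvector as the combination coefficients. The toy losses and data below are placeholders; the paper uses seven segmentation losses evaluated on polyp data, and its exact normalization of the coefficients may differ.

        import numpy as np

        def eigenloss_coefficients(loss_matrix):
            """PCA on a (num_samples, num_losses) matrix of individual loss values.

            Returns the leading eigenvector of the covariance matrix, whose entries
            serve as coefficients of the combined (eigen)loss.
            """
            centered = loss_matrix - loss_matrix.mean(axis=0)
            cov = np.cov(centered, rowvar=False)
            eigvals, eigvecs = np.linalg.eigh(cov)
            lead = eigvecs[:, np.argmax(eigvals)]
            return np.abs(lead)      # sign-free coefficients (one possible convention)

        def eigenloss(individual_losses, coeffs):
            """Combined loss: linear combination of the individual losses."""
            return float(np.dot(coeffs, individual_losses))

        # Toy usage: 100 samples, 3 individual losses (e.g. BCE, Dice, IoU stand-ins).
        rng = np.random.default_rng(0)
        L = rng.random((100, 3))
        coeffs = eigenloss_coefficients(L)
        print(coeffs, eigenloss(L[0], coeffs))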

    Apprentissage de la Cohérence Photométrique pour la Reconstruction de Formes Multi-Vues

    With the rise of augmented and virtual reality, estimating accurate shapes from multi-view RGB images is becoming an important task in computer vision. The dominant strategy employed for that purpose in recent years relies on depth map estimation followed by depth fusion, as depth maps prove to be efficient in recovering local surface details. Motivated by the recent success of convolutional neural networks, we take this strategy a step further and present a novel solution for depth map estimation which consists in sweeping a volume along the rays projected from a camera and inferring the surface presence probability at each point, seen by an arbitrary number of cameras. A strong motivation behind this work is to study the ability of learning-based features to outperform traditional 2D features when estimating depth from multi-view cues, especially with real-life dynamic scenes containing multiple moving subjects with complex surface details, scenarios where previous image-based MVS methods fail to recover accurate details. Our results demonstrate this ability, showing that a CNN trained on a standard static dataset can help recover surface details on dynamic scenes that are not visible to traditional 2D-feature-based methods. In addition, our evaluation includes a comparison to existing reconstruction pipelines on the standard evaluation dataset we used to train our network, showing that our solution performs on par with or better than these approaches.
    The rise of virtual and augmented reality technologies comes with a growing need for content suited to these technologies and their visualization methods. In particular, the ability to produce real-world content that can be viewed in 3D is becoming paramount. In this paper we consider the problem of reconstructing dynamic 3D scenes from color images, and we focus in particular on how convolutional neural networks can be leveraged to effectively improve this reconstruction process. The most recent multi-view reconstruction methods estimate per-view depth maps and then fuse these maps into an implicit 3D shape. A key step of these methods lies in the estimation of the depth maps, traditionally performed by searching for multi-view correspondences using photo-consistency criteria. We propose here to learn this photo-consistency function from examples instead of defining it through the correlation of photometric descriptors, as is done in most current methods. The intuition is that the correlation of image descriptors is inherently constrained and limited, and that deep networks have the capacity to learn broader configurations. Our results on real data show that this is indeed the case. Trained on a standard static dataset, convolutional networks allow us to recover details on a moving shape that classical image descriptors cannot extract. Comparative evaluations on these standard data are moreover favorable to the proposed method.
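    The sweeping strategy can be sketched as follows: for each pixel of the reference camera, 3D points are sampled along the ray at a set of depth hypotheses, the color patches observed by the other cameras at each sample are scored by a learned photo-consistency network, and the depth with the highest surface-presence probability is retained. The tiny network and random patches below are illustrative assumptions; the paper's architecture and sampling scheme differ.

        import numpy as np
        import torch
        import torch.nn as nn

        # Stand-in for the learned photo-consistency network: it scores, for one
        # 3D sample point, the patches seen by an arbitrary number of cameras
        # and returns a surface-presence probability.
        class PhotoConsistencyNet(nn.Module):
            def __init__(self, patch_dim=27):            # e.g. 3x3 RGB patch, flattened
                super().__init__()
                self.mlp = nn.Sequential(nn.Linear(patch_dim, 32), nn.ReLU(),
                                         nn.Linear(32, 16))
                self.head = nn.Linear(16, 1)

            def forward(self, patches):                  # patches: (num_views, patch_dim)
                feats = self.mlp(patches)                # per-view features
                pooled = feats.mean(dim=0)               # pool over an arbitrary view count
                return torch.sigmoid(self.head(pooled))  # surface presence probability

        def sweep_ray(net, patches_per_depth, depths):
            """Score each depth hypothesis along a ray and return the best depth."""
            with torch.no_grad():
                probs = torch.stack([net(p).squeeze() for p in patches_per_depth])
            return depths[int(probs.argmax())], probs

        # Toy usage: 5 depth hypotheses, each seen as 3x3 RGB patches by 4 cameras.
        net = PhotoConsistencyNet()
        depths = np.linspace(1.0, 3.0, 5)
        patches = [torch.randn(4, 27) for _ in depths]
        best_depth, probs = sweep_ray(net, patches, depths)
        print(best_depth, probs.numpy())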