9 research outputs found

    Fast capture of textured full-body avatar with RGB-D cameras

    We present a practical system which can provide a textured full-body avatar within three seconds. It uses sixteen RGB-depth (RGB-D) cameras, ten of which are arranged to capture the body, while six target the important head region. The configuration of the multiple cameras is formulated as a constraint-based minimum set space-covering problem, which is approximately solved by a heuristic algorithm. The resulting camera layout can cover the full-body surface of an adult, with geometric errors of less than 5 mm. After the cameras are arranged, they are calibrated using a mannequin before scanning real humans. The 16 RGB-D images are all captured within 1 s, which both avoids the need for the subject to attempt to remain still for an uncomfortable period, and helps to keep pose changes between different cameras small. All scans are combined and processed to reconstruct the photo-realistic textured mesh in 2 s. During both system calibration and working capture of a real subject, the high-quality RGB information is exploited to assist geometric reconstruction and texture stitching optimization.
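
    The abstract does not say which heuristic is used to solve the covering problem. As a rough illustration only, the sketch below applies the standard greedy set-cover heuristic to a toy instance; the function name, the camera labels, and the surface-patch numbering are all hypothetical and not taken from the paper.

```python
def greedy_cover(universe, candidates):
    """Greedy set-cover heuristic (an assumption, not the paper's algorithm).

    universe:   iterable of items that must be covered (e.g. surface patches)
    candidates: dict mapping a candidate name to the set of items it covers
    Returns the list of chosen candidate names, in the order they were picked.
    """
    uncovered = set(universe)
    chosen = []
    while uncovered:
        # Pick the candidate that covers the most still-uncovered items.
        best = max(candidates, key=lambda name: len(candidates[name] & uncovered))
        gain = candidates[best] & uncovered
        if not gain:
            raise ValueError("remaining items cannot be covered by any candidate")
        chosen.append(best)
        uncovered -= gain
    return chosen


# Toy instance: eight hypothetical surface patches, four hypothetical camera poses.
patches = range(1, 9)
views = {
    "cam_a": {1, 2, 3, 4},
    "cam_b": {3, 4, 5},
    "cam_c": {5, 6, 7, 8},
    "cam_d": {1, 8},
}
layout = greedy_cover(patches, views)
print(layout)
```

    A greedy choice is not guaranteed to be minimal, and the paper's formulation also includes constraints that this toy version ignores.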

    Foot Depth Map Point Cloud Completion using Deep Learning with Residual Blocks

    Fit is extremely important in footwear, as fit largely determines performance and comfort. Current footwear fit estimation mainly uses only shoe size, which is extremely limited in characterizing the shape of a foot or the shape of a shoe. 3D scanning presents a solution to this, where a foot shape can be captured and virtually fit with shoe models. Traditional 3D scanning techniques have their own complications, however, stemming from their need to collect views covering all aspects of an object. In this work we explore a deep learning technique to complete a foot scan point cloud from information contained in a single depth map view. We examine the benefits of implementing residual blocks in architectures for this application, and find that they can improve accuracies while reducing model size and training time.
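
    For readers unfamiliar with how a single depth map relates to a partial point cloud, the following minimal sketch back-projects depth pixels through a pinhole camera model. It is not the paper's pipeline: the function name and the intrinsics (fx, fy, cx, cy) are assumptions chosen for illustration, and the 2x2 depth map is made up.

```python
def depth_to_points(depth, fx, fy, cx, cy):
    """Back-project a depth map into 3D points with a pinhole camera model.

    depth:  list of rows, each a list of depth values (0 or less means no reading)
    fx, fy: focal lengths in pixels (assumed values, not from the paper)
    cx, cy: principal point in pixels (assumed values, not from the paper)
    Returns a list of (x, y, z) tuples in the camera frame.
    """
    points = []
    for v, row in enumerate(depth):
        for u, z in enumerate(row):
            if z <= 0:
                continue  # skip pixels without a valid depth reading
            x = (u - cx) * z / fx
            y = (v - cy) * z / fy
            points.append((x, y, z))
    return points


# Made-up 2x2 depth map with two valid pixels.
depth = [
    [0.0, 1.0],
    [2.0, 0.0],
]
cloud = depth_to_points(depth, fx=2.0, fy=2.0, cx=0.5, cy=0.5)
print(cloud)
```

    The result covers only the surface visible from that one view, which is why a completion step is needed to obtain the full foot shape.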

    Global 3D non-rigid registration of deformable objects using a single RGB-D camera

    We present a novel global non-rigid registration method for dynamic 3D objects. Our method allows objects to undergo large non-rigid deformations, and achieves high-quality results even with substantial pose change or camera motion between views. In addition, our method does not require a template prior and uses less raw data than tracking-based methods, since only a sparse set of scans is needed. We compute the deformations of all the scans simultaneously by optimizing a global alignment problem to avoid the well-known loop closure problem, and use an as-rigid-as-possible constraint to eliminate the shrinkage problem of the deformed shapes, especially near open boundaries of scans. To cope with large-scale problems, we design a coarse-to-fine multi-resolution scheme, which also prevents the optimization from being trapped in local minima. The proposed method is evaluated on public datasets and real datasets captured by an RGB-D sensor. Experimental results demonstrate that the proposed method obtains better results than several state-of-the-art methods.

    Generative RGB-D face completion for head-mounted display removal

    Head-mounted displays (HMDs) are an essential display device for the observation of virtual reality (VR) environments. However, HMDs obstruct external capturing methods from recording the user's upper face. This severely impacts social VR applications, such as teleconferencing, which commonly rely on external RGB-D sensors to capture a volumetric representation of the user. In this paper, we introduce an HMD removal framework based on generative adversarial networks (GANs), capable of jointly filling in missing color and depth data in RGB-D face images. Our framework includes an RGB-based identity loss function for identity preservation and several components aimed at surface reproduction. Our results demonstrate that our framework is able to remove HMDs from synthetic RGB-D face images while preserving the subject's identity

    A survey on human performance capture and animation

    With the rapid development of computing technology, three-dimensional (3D) human body models and their dynamic motions are widely used in the digital entertainment industry. Human performance mainly involves human body shapes and motions. Key research problems include how to capture and analyze static geometric appearance and dynamic movement of human bodies, and how to simulate human body motions with physical effects. In this survey, according to main research directions of human body performance capture and animation, we summarize recent advances in key research topics, namely human body surface reconstruction, motion capture and synthesis, as well as physics-based motion simulation, and further discuss future research problems and directions. We hope this will be helpful for readers to have a comprehensive understanding of human performance capture and animation.

    Deep Learning 3D Scans for Footwear Fit Estimation from a Single Depth Map

    In clothing and particularly in footwear, the variance in the size and shape of people and of clothing poses a problem of how to match items of clothing to a person. This is specifically important in footwear, as fit is highly dependent on foot shape, which is not fully captured by shoe size. 3D scanning can be used to determine detailed personalized shape information, which can then be used to match against product shape for a more personalized footwear matching experience. In current implementations, however, this process is typically expensive and cumbersome. Typical scanning techniques require that a camera capture an object from many views in order to reconstruct shape. This usually requires either many cameras or a moving camera system, both of which are complex engineering tasks to construct. Ideally, in order to reduce the cost and complexity of scanning systems as much as possible, only a single image from a single camera would be needed. With recent techniques, semantics such as knowing the kind of object in view can be leveraged to determine the full 3D shape given incomplete information. Deep learning methods have been shown to be able to reconstruct 3D shape from limited inputs in highly symmetrical objects such as furniture and vehicles. We apply a deep learning approach to the domain of foot scanning, and present methods to reconstruct a 3D point cloud from a single input depth map. Anthropomorphic body parts can be challenging due to their irregular shapes, difficulty of parameterization and limited symmetries. We present two methods leveraging deep learning models to produce complete foot scans from a single input depth map. We utilize 3D data from MPII Human Shape based on the CAESAR database, and train deep neural networks to learn anthropomorphic shape representations. Our first method attempts to complete the point cloud supplied by the input depth map by simply synthesizing the remaining information. We show that this method is capable of synthesizing the remainder of a point cloud with accuracies of 2.92±0.72 mm, and can be improved to accuracies of 2.55±0.75 mm when using an updated network architecture. Our second method fully synthesizes a complete point cloud foot scan from multiple virtual viewpoints. We show that this method can produce foot scans with accuracies of 1.55±0.41 mm from a single input depth map. We performed additional experiments on real-world foot scans captured using Kinect Fusion. We find that despite being trained only on a low-resolution representation of foot shape, our models are able to recognize and synthesize reasonable complete point cloud scans. Our results suggest that our methods can be extended to work in the real world, with additional domain-specific data.