
    Shape Completion using 3D-Encoder-Predictor CNNs and Shape Synthesis

    We introduce a data-driven approach to complete partial 3D shapes through a combination of volumetric deep neural networks and 3D shape synthesis. From a partially-scanned input shape, our method first infers a low-resolution -- but complete -- output. To this end, we introduce a 3D-Encoder-Predictor Network (3D-EPN) which is composed of 3D convolutional layers. The network is trained to predict and fill in missing data, and operates on an implicit surface representation that encodes both known and unknown space. This allows us to predict global structure in unknown areas at high accuracy. We then correlate these intermediary results with 3D geometry from a shape database at test time. In a final pass, we propose a patch-based 3D shape synthesis method that imposes the 3D geometry from these retrieved shapes as constraints on the coarsely-completed mesh. This synthesis process enables us to reconstruct fine-scale detail and generate high-resolution output while respecting the global mesh structure obtained by the 3D-EPN. Although our 3D-EPN outperforms state-of-the-art completion methods, the main contribution of our work lies in the combination of a data-driven shape predictor and analytic 3D shape synthesis. In our results, we show extensive evaluations on a newly-introduced shape completion benchmark for both real-world and synthetic data.
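
    As a hedged illustration of the encoder-predictor idea described above, the sketch below (PyTorch; layer counts, channel widths, and the 32^3 resolution are invented for brevity and are not the published 3D-EPN configuration) maps a two-channel volumetric grid -- an implicit surface value plus a known/unknown-space mask -- to a completed low-resolution grid.

        import torch
        import torch.nn as nn

        class TinyEncoderPredictor3D(nn.Module):
            """Minimal 3D encoder-predictor: input is a 2-channel grid (implicit
            surface value + known/unknown mask), output a completed 1-channel grid.
            Illustrative only, not the published 3D-EPN architecture."""
            def __init__(self):
                super().__init__()
                self.encoder = nn.Sequential(
                    nn.Conv3d(2, 16, 4, stride=2, padding=1), nn.ReLU(),   # 32^3 -> 16^3
                    nn.Conv3d(16, 32, 4, stride=2, padding=1), nn.ReLU(),  # 16^3 -> 8^3
                )
                self.predictor = nn.Sequential(
                    nn.ConvTranspose3d(32, 16, 4, stride=2, padding=1), nn.ReLU(),  # 8^3 -> 16^3
                    nn.ConvTranspose3d(16, 1, 4, stride=2, padding=1),              # 16^3 -> 32^3
                )
            def forward(self, x):
                return self.predictor(self.encoder(x))

        partial_scan = torch.randn(1, 2, 32, 32, 32)        # batch of one partial scan
        completed = TinyEncoderPredictor3D()(partial_scan)  # (1, 1, 32, 32, 32)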

    Live User-guided Intrinsic Video For Static Scenes

    We present a novel real-time approach for user-guided intrinsic decomposition of static scenes captured by an RGB-D sensor. In the first step, we acquire a three-dimensional representation of the scene using a dense volumetric reconstruction framework. The obtained reconstruction serves as a proxy to densely fuse reflectance estimates and to store user-provided constraints in three-dimensional space. User constraints, in the form of constant shading and reflectance strokes, can be placed directly on the real-world geometry using an intuitive touch-based interaction metaphor, or using interactive mouse strokes. Fusing the decomposition results and constraints in three-dimensional space allows for robust propagation of this information to novel views by re-projection. We leverage this information to improve on the decomposition quality of existing intrinsic video decomposition techniques by further constraining the ill-posed decomposition problem. In addition to improved decomposition quality, we show a variety of live augmented reality applications such as recoloring of objects, relighting of scenes, and editing of material appearance.
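
    The user-guided decomposition can be pictured with a toy, single-image version of the problem. The sketch below (NumPy; the energy weights, the plain gradient-descent solver, and the vertical-only smoothness term are simplifications, not the paper's real-time solver or its 3D constraint fusion) splits log-intensity into reflectance and shading while a "constant reflectance" stroke constrains the stroked pixels.

        import numpy as np

        def decompose_with_stroke(log_I, stroke, lam=1.0, step=0.1, iters=300):
            """Toy log-domain intrinsic decomposition, log I = r + s.
            A constant-reflectance user stroke pulls r towards its mean inside the
            stroke mask; a finite-difference smoothness term keeps shading s slowly
            varying. Plain gradient descent on the resulting quadratic energy."""
            r = log_I.copy()                 # initialise reflectance with the image
            s = np.zeros_like(log_I)         # shading
            for _ in range(iters):
                data = r + s - log_I                                     # reconstruction residual
                g_r, g_s = data.copy(), data.copy()
                if stroke.any():
                    g_r[stroke] += lam * (r[stroke] - r[stroke].mean())  # stroke constraint
                g_s[:-1] += lam * (s[:-1] - s[1:])                       # shading smoothness
                g_s[1:] += lam * (s[1:] - s[:-1])
                r -= step * g_r
                s -= step * g_s
            return np.exp(r), np.exp(s)

        # usage: R, S = decompose_with_stroke(np.log(gray_img + 1e-4), stroke_mask)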

    Foundry: Hierarchical Material Design for Multi-Material Fabrication

    We demonstrate a new approach for designing functional material definitions for multi-material fabrication using our system called Foundry. Foundry provides an interactive and visual process for hierarchically designing spatially-varying material properties (e.g., appearance, mechanical, optical). The resulting meta-materials exhibit structure at the micro and macro level and can surpass the qualities of traditional composites. The material definitions are created by composing a set of operators into an operator graph. Each operator performs a volume decomposition operation, remaps space, or constructs and assigns a material composition. The operators are implemented using a domain-specific language for multi-material fabrication; users can easily extend the library by writing their own operators. Foundry can be used to build operator graphs that describe complex, parameterized, resolution-independent, and reusable material definitions. We also describe how to stage the evaluation of the final material definition, which, in conjunction with progressive refinement, allows for interactive material evaluation even for complex designs. We show sophisticated and functional parts designed with our system. Funding: National Science Foundation (U.S.) (1138967, 1409310, 1547088); National Science Foundation (U.S.) Graduate Research Fellowship Program; Massachusetts Institute of Technology Undergraduate Research Opportunities Program.
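
    The operator-graph idea can be sketched with ordinary function composition: each operator either assigns a material composition, decomposes the volume, or remaps space, and a material definition is a composition of such operators evaluated at query points. The Python below only illustrates that structure under assumed operator names (assign, split_z, repeat_z); it is not Foundry's domain-specific language.

        from typing import Callable, Dict, Tuple

        Point = Tuple[float, float, float]
        Material = Dict[str, float]                 # e.g. {"stiff": 0.7, "soft": 0.3}
        Operator = Callable[[Point], Material]

        def assign(material: Material) -> Operator:
            """Leaf operator: constant material composition everywhere."""
            return lambda p: material

        def split_z(threshold: float, below: Operator, above: Operator) -> Operator:
            """Volume-decomposition operator: route the query point by its z value."""
            return lambda p: below(p) if p[2] < threshold else above(p)

        def repeat_z(period: float, child: Operator) -> Operator:
            """Space-remapping operator: tile the child definition along z."""
            return lambda p: child((p[0], p[1], p[2] % period))

        # A tiny, resolution-independent definition: alternating soft/stiff layers.
        layered = repeat_z(2.0, split_z(1.0, assign({"soft": 1.0}), assign({"stiff": 1.0})))
        print(layered((0.0, 0.0, 3.5)))             # -> {'stiff': 1.0}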

    FastHuman: Reconstructing High-Quality Clothed Human in Minutes

    We propose an approach for optimizing high-quality clothed human body shapes in minutes, using multi-view posed images. While traditional neural rendering methods struggle to disentangle geometry and appearance using only a rendering loss, and are computationally intensive, our method uses a mesh-based patch warping technique to ensure multi-view photometric consistency, and spherical harmonics (SH) illumination to refine geometric details efficiently. We employ an oriented point cloud shape representation and SH shading, which significantly reduces optimization and rendering times compared to implicit methods. Our approach has demonstrated promising results on both synthetic and real-world datasets, making it an effective solution for rapidly generating high-quality human body shapes. Project page: https://l1346792580123.github.io/nccsfs/ Comment: International Conference on 3D Vision, 3DV 202
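
    The second-order spherical-harmonics shading used to refine geometry can be written down compactly: shading at a surface point is a dot product between the 9-term SH basis evaluated at the point's normal and the estimated lighting coefficients. The NumPy sketch below uses the standard basis constants; the patch-warping photometric term and the point-cloud optimization itself are not shown.

        import numpy as np

        def sh_basis(normals):
            """Second-order (9-term) real SH basis at unit normals, (N, 3) -> (N, 9)."""
            x, y, z = normals[:, 0], normals[:, 1], normals[:, 2]
            return np.stack([
                0.282095 * np.ones_like(x),
                0.488603 * y, 0.488603 * z, 0.488603 * x,
                1.092548 * x * y, 1.092548 * y * z,
                0.315392 * (3.0 * z * z - 1.0),
                1.092548 * x * z, 0.546274 * (x * x - y * y),
            ], axis=1)

        def sh_shading(normals, sh_coeffs):
            """Per-point grey-scale shading: SH basis dotted with 9 lighting coefficients."""
            return sh_basis(normals) @ sh_coeffs

        # usage: shaded = albedo * sh_shading(point_normals, estimated_coeffs)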

    Towards recovery of complex shapes in meshes using digital images for reverse engineering applications

    When an object has complex shapes, or when its outer surfaces are simply inaccessible, some of its parts may not be captured during its reverse engineering. These deficiencies in the point cloud result in a set of holes in the reconstructed mesh. This paper deals with the use of information extracted from digital images to recover missing areas of a physical object. The proposed algorithm fills in these holes by solving an optimization problem that combines two kinds of information: (1) the geometric information available in the surroundings of the holes, and (2) the information contained in an image of the real object. The constraints come from the image irradiance equation, a first-order non-linear partial differential equation that links the position of the mesh vertices to the light intensity of the image pixels. The blending conditions are satisfied by using an objective function based on a mechanical model of a bar network that simulates the curvature evolution over the mesh. The shortcomings inherent both in current hole-filling algorithms and in the resolution of the image irradiance equation are thereby overcome.
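
    To make the role of the irradiance equation concrete, the sketch below solves a toy height-field version of the problem: the unknown surface z(x, y) over the hole is optimised so that its Lambertian shading matches the image intensity, with a Laplacian term standing in for the bar-network fairing of the paper. The light direction, albedo, and weights are assumed values for illustration, not the paper's formulation on a general mesh.

        import numpy as np
        from scipy.optimize import least_squares

        def fill_hole(z0, intensity, light=(0.0, 0.0, 1.0), albedo=1.0, w_smooth=0.5):
            """Toy hole filling on a height field z(x, y): the residuals couple the
            image irradiance equation (Lambertian shading should reproduce the image
            intensity) with a Laplacian smoothness term over the patch."""
            h, w = z0.shape
            L = np.asarray(light, float)
            L /= np.linalg.norm(L)

            def residuals(zf):
                z = zf.reshape(h, w)
                gy, gx = np.gradient(z)                          # surface slopes
                n = np.dstack([-gx, -gy, np.ones_like(z)])
                n /= np.linalg.norm(n, axis=2, keepdims=True)    # unit normals
                shade = albedo * np.clip(n @ L, 0.0, None)       # Lambertian irradiance
                irr = (shade - intensity).ravel()                # match image pixels
                lap = (4 * z[1:-1, 1:-1] - z[:-2, 1:-1] - z[2:, 1:-1]
                       - z[1:-1, :-2] - z[1:-1, 2:]).ravel()     # smoothness
                return np.concatenate([irr, w_smooth * lap])

            return least_squares(residuals, z0.ravel()).x.reshape(h, w)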

    Semantic 3D Reconstruction with Finite Element Bases

    We propose a novel framework for the discretisation of multi-label problems on arbitrary, continuous domains. Our work bridges the gap between general FEM discretisations and labeling problems that arise in a variety of computer vision tasks, including, for instance, those derived from the generalised Potts model. Starting from the popular formulation of labeling as a convex relaxation by functional lifting, we show that FEM discretisation is valid for the most general case, where the regulariser is anisotropic and non-metric. While our findings are generic and applicable to different vision problems, we demonstrate their practical implementation in the context of semantic 3D reconstruction, where such regularisers have proved particularly beneficial. The proposed FEM approach leads to a smaller memory footprint as well as faster computation, and it constitutes a very simple way to enable variable, adaptive resolution within the same model.
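
    For reference, the functional-lifting relaxation that the abstract starts from is usually written as follows (Potts-type case with an isotropic total-variation regulariser; the paper treats the more general anisotropic, non-metric regularisers and replaces the usual grid discretisation of the u_i with finite-element bases):

        \min_{u\,:\,\Omega \to \Delta^{L}} \;
            \sum_{i=1}^{L} \int_{\Omega} u_i(x)\, \rho_i(x)\, dx
            \;+\; \frac{\lambda}{2} \sum_{i=1}^{L} \int_{\Omega} \lvert \nabla u_i(x) \rvert\, dx,
        \qquad
        \Delta^{L} = \Big\{\, u \ge 0,\ \sum_{i=1}^{L} u_i = 1 \,\Big\},

    where \rho_i is the per-label data cost and the relaxed indicator functions u_i take values in [0, 1] instead of {0, 1}.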

    Structured 3D Features for Reconstructing Controllable Avatars

    We introduce Structured 3D Features, a model based on a novel implicit 3D representation that pools pixel-aligned image features onto dense 3D points sampled from a parametric, statistical human mesh surface. The 3D points have associated semantics and can move freely in 3D space. This allows for optimal coverage of the person of interest, beyond just the body shape, which in turn helps in modeling accessories, hair, and loose clothing. Building on this, we present a complete 3D transformer-based attention framework which, given a single image of a person in an unconstrained pose, generates an animatable 3D reconstruction with albedo and illumination decomposition, produced by a single end-to-end model trained semi-supervised, with no additional postprocessing. We show that our S3F model surpasses the previous state-of-the-art on various tasks, including monocular 3D reconstruction, as well as albedo and shading estimation. Moreover, we show that the proposed methodology allows novel view synthesis, relighting, and re-posing the reconstruction, and can naturally be extended to handle multiple input images (e.g. different views of a person, or the same view in different poses, in video). Finally, we demonstrate the editing capabilities of our model for 3D virtual try-on applications. Comment: Accepted at CVPR 2023. Project page: https://enriccorona.github.io/s3f/, Video: https://www.youtube.com/watch?v=mcZGcQ6L-2
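
    The pixel-aligned pooling step can be sketched in a few lines: project body-surface points into the image with the camera intrinsics and bilinearly sample a CNN feature map at the projected pixels. The PyTorch below assumes points already expressed in the camera frame and a single feature map; the parametric body sampling, per-point semantics, and the transformer that consumes the pooled features are omitted.

        import torch
        import torch.nn.functional as F

        def pool_pixel_aligned_features(points, K, feat_map):
            """points: (N, 3) in the camera frame; K: (3, 3) intrinsics;
            feat_map: (C, H, W). Returns (N, C) bilinearly sampled features."""
            uvw = points @ K.T                                 # perspective projection
            uv = uvw[:, :2] / uvw[:, 2:3]                      # pixel coordinates
            H, W = feat_map.shape[1:]
            grid = torch.stack([uv[:, 0] / (W - 1),            # normalise to [-1, 1]
                                uv[:, 1] / (H - 1)], dim=-1) * 2 - 1
            grid = grid.view(1, 1, -1, 2)
            feats = F.grid_sample(feat_map[None], grid, align_corners=True)
            return feats[0, :, 0].T                            # (N, C) per-point features

        # usage: per_point = pool_pixel_aligned_features(body_points, intrinsics, cnn_features)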