
    Canonical 3D Deformer Maps: Unifying parametric and non-parametric methods for dense weakly-supervised category reconstruction

    We propose the Canonical 3D Deformer Map, a new representation of the 3D shape of common object categories that can be learned from a collection of 2D images of independent objects. Our method builds in a novel way on concepts from parametric deformation models, non-parametric 3D reconstruction, and canonical embeddings, combining their individual advantages. In particular, it learns to associate each image pixel with a deformation model of the corresponding 3D object point that is canonical, i.e., intrinsic to the identity of the point and shared across objects of the category. The result is a method that, given only sparse 2D supervision at training time, can at test time reconstruct the 3D shape and texture of objects from single views, while establishing meaningful dense correspondences between object instances. It also achieves state-of-the-art results in dense 3D reconstruction on public in-the-wild datasets of faces, cars, and birds. Comment: Published at NeurIPS 2020.
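    The sketch below (PyTorch, not the authors' released code) illustrates the core idea as the abstract states it: a network assigns each pixel a canonical embedding shared across instances, and a deformation model maps that embedding, conditioned on a per-image shape code, to a 3D point. All module architectures, names, and dimensions are illustrative assumptions.

```python
# Minimal sketch (an assumption, not the paper's implementation) of a
# Canonical 3D Deformer Map: per-pixel canonical embeddings, shared
# across object instances, are deformed into 3D points via a per-image
# shape code.
import torch
import torch.nn as nn

class CanonicalDeformerMap(nn.Module):
    def __init__(self, embed_dim=64, shape_dim=32):
        super().__init__()
        # Per-pixel canonical embedding: intrinsic to the identity of the
        # object point, so corresponding pixels on different instances map
        # to nearby embeddings (this yields dense correspondences).
        self.pixel_encoder = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, embed_dim, 3, padding=1),
        )
        # Per-image shape code capturing instance-specific deformation.
        self.shape_encoder = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(embed_dim, shape_dim),
        )
        # Deformation model: canonical embedding + shape code -> 3D point.
        self.deformer = nn.Sequential(
            nn.Linear(embed_dim + shape_dim, 128), nn.ReLU(),
            nn.Linear(128, 3),
        )

    def forward(self, image):
        emb = self.pixel_encoder(image)                    # (B, E, H, W)
        code = self.shape_encoder(emb)                     # (B, S)
        b, e, h, w = emb.shape
        emb_flat = emb.permute(0, 2, 3, 1).reshape(b, h * w, e)
        code_rep = code.unsqueeze(1).expand(-1, h * w, -1)
        xyz = self.deformer(torch.cat([emb_flat, code_rep], dim=-1))
        return xyz.reshape(b, h, w, 3)                     # per-pixel 3D

# Example: per-pixel 3D point prediction from a single view.
model = CanonicalDeformerMap()
points = model(torch.randn(2, 3, 64, 64))  # -> (2, 64, 64, 3)
```

    The sketch shows only the forward pass; in the paper the model is trained from sparse 2D supervision, which this toy example does not reproduce.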

    Capturing the geometry of object categories from video supervision

    In this article, we are interested in capturing the 3D geometry of object categories simply by looking around them. Our unsupervised method fundamentally departs from traditional approaches that require either CAD models or manual supervision. It only uses video sequences capturing a handful of instances of an object category to train a deep architecture tailored for extracting 3D geometry predictions. Our deep architecture has three components. First, a Siamese viewpoint factorization network robustly aligns the input videos and, as a consequence, learns to predict the absolute category-specific viewpoint from a single image depicting any previously unseen instance of that category. Second, a depth estimation network performs monocular depth prediction. Finally, a 3D shape completion network predicts the full shape of the depicted object instance by re-using the output of the monocular depth prediction module. We also propose a way to configure the networks so that they produce probabilistic predictions. We demonstrate that, properly used in our framework, this self-assessment mechanism is crucial for obtaining high-quality predictions. Our network achieves state-of-the-art results on viewpoint prediction, depth estimation, and 3D point cloud estimation on public benchmarks.
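    Below is a hedged sketch of how the three components and the probabilistic self-assessment could fit together. The heteroscedastic Gaussian likelihood head is one standard way to realize probabilistic depth predictions and is an assumption here, not necessarily the paper's exact formulation; all architectures and names are placeholders.

```python
# Toy three-stage pipeline (assumed structure): viewpoint prediction,
# monocular depth with an uncertainty channel, and shape completion.
import torch
import torch.nn as nn

class ViewpointNet(nn.Module):
    # Predicts an absolute category-specific viewpoint (here: axis-angle).
    # In the paper this is trained in a Siamese setup over video frames;
    # only the single-image forward pass is sketched here.
    def __init__(self):
        super().__init__()
        self.trunk = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1),
        )
        self.head = nn.Linear(64, 3)

    def forward(self, img):
        return self.head(self.trunk(img).mean(dim=(2, 3)))

class DepthNet(nn.Module):
    # Monocular depth plus a log-variance channel for self-assessment.
    def __init__(self):
        super().__init__()
        self.net = nn.Conv2d(3, 2, 3, padding=1)  # [depth, log_var]

    def forward(self, img):
        out = self.net(img)
        return out[:, :1], out[:, 1:]

class ShapeNet(nn.Module):
    # Completes the full shape from the visible-surface depth estimate.
    def __init__(self, n_points=256):
        super().__init__()
        self.n_points = n_points
        self.net = nn.Sequential(
            nn.Conv2d(1, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(16, n_points * 3),
        )

    def forward(self, depth):
        return self.net(depth).reshape(-1, self.n_points, 3)

def gaussian_nll(pred, log_var, target):
    # Heteroscedastic Gaussian negative log-likelihood: penalizes both
    # prediction error and overconfident (too-small) variance estimates.
    return (0.5 * torch.exp(-log_var) * (pred - target) ** 2
            + 0.5 * log_var).mean()

img = torch.randn(2, 3, 64, 64)
view = ViewpointNet()(img)                  # (2, 3) axis-angle pose
depth, log_var = DepthNet()(img)            # (2, 1, 64, 64) each
cloud = ShapeNet()(depth)                   # (2, 256, 3) point cloud
loss = gaussian_nll(depth, log_var, torch.randn_like(depth))
```

    The log-variance head is what lets the system assess its own predictions: downstream modules can down-weight regions where the predicted variance is high.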