
    Some Remarks on the Smarandache Function

    Remarks on a Function in the Number Theory

    Structured 3D Features for Reconstructing Controllable Avatars

    We introduce Structured 3D Features, a model based on a novel implicit 3D representation that pools pixel-aligned image features onto dense 3D points sampled from a parametric, statistical human mesh surface. The 3D points carry associated semantics and can move freely in 3D space. This allows optimal coverage of the person of interest, beyond just the body shape, which in turn helps model accessories, hair, and loose clothing. Owing to this, we present a complete 3D transformer-based attention framework which, given a single image of a person in an unconstrained pose, generates an animatable 3D reconstruction with albedo and illumination decomposition, as the result of a single end-to-end model, trained semi-supervised, with no additional postprocessing. We show that our S3F model surpasses the previous state of the art on various tasks, including monocular 3D reconstruction as well as albedo and shading estimation. Moreover, the proposed methodology supports novel view synthesis, relighting, and re-posing the reconstruction, and naturally extends to multiple input images (e.g. different views of a person, or the same view in different poses, in video). Finally, we demonstrate the editing capabilities of our model for 3D virtual try-on applications.
    Comment: Accepted at CVPR 2023. Project page: https://enriccorona.github.io/s3f/, Video: https://www.youtube.com/watch?v=mcZGcQ6L-2
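    To make the pooling step concrete: each 3D body-surface point is projected into the image and its pixel-aligned feature is gathered by bilinear sampling. The sketch below is a minimal illustration under a pinhole-camera assumption; the function name, tensor shapes, and intrinsics handling are ours, not the paper's code.

```python
import torch
import torch.nn.functional as F

def pool_pixel_aligned_features(feat_map, points_3d, K):
    """Pool pixel-aligned image features onto 3D surface points.

    feat_map:  (1, C, H, W) image feature map
    points_3d: (1, N, 3) points sampled from a body mesh, in camera coords
    K:         (3, 3) camera intrinsics (pinhole assumption)
    Returns (1, N, C) per-point features.
    """
    # Perspective projection: x = K @ X, then divide by depth.
    proj = points_3d @ K.T                      # (1, N, 3)
    uv = proj[..., :2] / proj[..., 2:3]         # (1, N, 2) pixel coords
    # Normalize pixel coords to [-1, 1] for grid_sample.
    _, _, H, W = feat_map.shape
    grid = torch.stack([2 * uv[..., 0] / (W - 1) - 1,
                        2 * uv[..., 1] / (H - 1) - 1], dim=-1)
    grid = grid.unsqueeze(2)                    # (1, N, 1, 2)
    sampled = F.grid_sample(feat_map, grid, align_corners=True)
    return sampled.squeeze(-1).transpose(1, 2)  # (1, N, C)
```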

    Large displacement 3D scene flow with occlusion reasoning

    The emergence of modern, affordable and accurate RGB-D sensors increases the need for single-view approaches to estimate 3-dimensional motion, also known as scene flow. In this paper we propose a coarse-to-fine, dense, correspondence-based scene flow formulation that relies on explicit geometric reasoning to account for the effects of large displacements and to model occlusion. Our methodology enforces local motion rigidity at the level of the 3D point cloud without explicitly smoothing the parameters of adjacent neighborhoods. By integrating all geometric and photometric components in a single, consistent, occlusion-aware energy model, defined over overlapping, image-adaptive neighborhoods, our method can process fast motions and large occlusion areas, as present in challenging datasets like the MPI Sintel Flow Dataset, recently augmented with depth information. By explicitly modeling large displacements and occlusion, we can handle difficult sequences that cannot currently be processed by state-of-the-art scene flow methods. We also show that by integrating depth information into the model, we obtain correspondence fields with improved spatial support and sharper boundaries compared to state-of-the-art, large-displacement optical flow methods.
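    To make the local-rigidity term concrete, one can measure, per neighborhood, the residual of the best rigid transform aligning points to their flowed positions (a Procrustes/Kabsch fit) and penalize it. This is a minimal sketch; the function name and the plain squared-error weighting are assumptions, not the paper's exact energy terms.

```python
import numpy as np

def rigidity_residual(P, Q):
    """Residual of the best rigid fit (R, t) mapping neighborhood P to Q.

    P, Q: (N, 3) corresponding 3D points before/after motion.
    Returns the mean squared distance after rigid (Kabsch) alignment.
    """
    cP, cQ = P.mean(0), Q.mean(0)
    H = (P - cP).T @ (Q - cQ)                   # 3x3 cross-covariance
    U, _, Vt = np.linalg.svd(H)
    # Correction term so R is a rotation, not a reflection.
    D = np.diag([1.0, 1.0, np.linalg.det(Vt.T @ U.T)])
    R = Vt.T @ D @ U.T
    t = cQ - R @ cP
    return np.mean(np.sum((P @ R.T + t - Q) ** 2, axis=1))
```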

    Deep Learning of Graph Matching

    The problem of graph matching under node and pairwise constraints is fundamental in areas as diverse as combinatorial optimization, machine learning, and computer vision, where representing both the relations between nodes and their neighborhood structure is essential. We present an end-to-end model that makes it possible to learn all parameters of the graph matching process, including the unary and pairwise node neighborhoods, represented as deep feature extraction hierarchies. The challenge lies in formulating the different matrix computation layers of the model so that gradients propagate consistently and efficiently through the complete pipeline: from the loss function, through the combinatorial optimization layer solving the matching problem, down to the feature extraction hierarchy. Our computer vision experiments and ablation studies on challenging datasets like PASCAL VOC keypoints, Sintel, and CUB show that matching models refined end-to-end are superior to counterparts based on feature hierarchies trained for other problems.
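    One relaxation that makes the combinatorial layer differentiable is spectral matching: treat the vector of candidate assignments as the leading eigenvector of the pairwise affinity matrix and compute it by power iteration, each step of which admits gradients. A minimal sketch, assuming a precomputed affinity matrix; the iteration count and epsilon are illustrative.

```python
import torch

def spectral_matching(M, n_iters=50):
    """Approximate graph matching by power iteration on the affinity matrix.

    M: (nm, nm) symmetric, non-negative affinity between candidate
       assignments (node i -> node a). Every step is differentiable, so a
       matching loss can backpropagate into the features that built M.
    Returns a soft assignment vector of unit L2 norm.
    """
    v = torch.full((M.shape[0],), 1.0 / M.shape[0] ** 0.5, dtype=M.dtype)
    for _ in range(n_iters):
        v = M @ v
        v = v / (v.norm() + 1e-12)  # converges to the leading eigenvector
    return v
```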

    Deep network for the integrated 3D sensing of multiple people in natural images

    We present MubyNet, a feed-forward, multitask, bottom-up system for the integrated localization, 3D pose, and shape estimation of multiple people in monocular images. The challenge is the formal modeling of a problem that intrinsically requires both discrete and continuous computation, e.g. grouping people vs. predicting 3D pose. The model identifies human body structures (joints and limbs) in images, groups them based on 2D and 3D information fused using learned scoring functions, and optimally aggregates such responses into partial or complete 3D human skeleton hypotheses under kinematic tree constraints, without knowing in advance the number of people in the scene or their visibility relations. We design a multi-task deep neural network with differentiable stages in which the person grouping problem is formulated as an integer program based on learned body part scores parameterized by both 2D and 3D information. This avoids the suboptimality of separate 2D and 3D reasoning, with grouping performed on the combined representation. The final stage of 3D pose and shape prediction is based on a learned attention process in which information from different human body parts is optimally integrated. State-of-the-art results are obtained on large-scale datasets like Human3.6M and Panoptic, and qualitatively by reconstructing the 3D shape and pose of multiple people, under occlusion, in difficult monocular images.
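    As a toy version of the grouping stage: given learned 2D+3D compatibility scores between candidate joints, limbs can be formed by an assignment solver. The paper formulates this jointly as an integer program over whole skeletons; the sketch below substitutes an independent optimal bipartite assignment per limb type, which is a deliberate simplification, and the function name and score threshold are assumptions.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def group_limbs(scores):
    """Connect candidate joints of two types into limbs.

    scores: (n_a, n_b) learned 2D+3D compatibility scores between
    detections of joint type A and joint type B. Solves one limb type
    independently; the paper instead solves all limb types jointly as
    an integer program under kinematic tree constraints.
    """
    rows, cols = linear_sum_assignment(scores, maximize=True)
    # Keep only pairs whose learned score supports a connection.
    return [(a, b) for a, b in zip(rows, cols) if scores[a, b] > 0]
```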

    Semi-supervised learning and optimization for hypergraph matching

    Graph and hypergraph matching are important problems in computer vision. They are successfully used in many applications requiring 2D or 3D feature matching, such as 3D reconstruction and object recognition. While graph matching is limited to pairwise relationships, hypergraph matching permits relationships between sets of features of any order. Consequently, it carries the promise of making matching more robust to changes in scale, deformations, and outliers. In this paper we make two contributions. First, we present the first semi-supervised algorithm for learning the parameters that control the hypergraph matching model, and demonstrate experimentally that it significantly improves the performance of current state-of-the-art methods. Second, we propose a novel, efficient hypergraph matching algorithm which outperforms the state of the art and, when used in combination with other higher-order matching algorithms, consistently improves their performance.
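    For intuition on higher-order matching: a common relaxation contracts a sparse third-order affinity tensor against the current soft assignment and renormalizes, a tensor analogue of power iteration. The sketch below is generic, not the specific algorithm contributed by this paper; the sparse data layout and iteration count are assumptions.

```python
import numpy as np

def hypergraph_matching(triples, values, n, n_iters=100):
    """Third-order hypergraph matching by tensor power iteration.

    triples: (T, 3) integer index triples (i, j, k) of candidate
             assignments with non-zero higher-order affinity; assumed
             to list all symmetric permutations of each hyperedge
    values:  (T,) the corresponding affinity values
    n:       number of candidate assignments
    Returns a soft assignment vector (leading 'eigenvector' of the tensor).
    """
    v = np.full(n, 1.0 / np.sqrt(n))
    for _ in range(n_iters):
        new = np.zeros(n)
        # v_i <- sum_{j,k} H_ijk * v_j * v_k  (tensor-vector contraction)
        np.add.at(new, triples[:, 0],
                  values * v[triples[:, 1]] * v[triples[:, 2]])
        v = new / (np.linalg.norm(new) + 1e-12)
    return v
```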