1,241 research outputs found

    General Dynamic Scene Reconstruction from Multiple View Video

    Get PDF
    This paper introduces a general approach to dynamic scene reconstruction from multiple moving cameras without prior knowledge or limiting constraints on the scene structure, appearance, or illumination. Existing techniques for dynamic scene reconstruction from multiple wide-baseline camera views primarily focus on accurate reconstruction in controlled environments, where the cameras are fixed and calibrated and background is known. These approaches are not robust for general dynamic scenes captured with sparse moving cameras. Previous approaches for outdoor dynamic scene reconstruction assume prior knowledge of the static background appearance and structure. The primary contributions of this paper are twofold: an automatic method for initial coarse dynamic scene segmentation and reconstruction without prior knowledge of background appearance or structure; and a general robust approach for joint segmentation refinement and dense reconstruction of dynamic scenes from multiple wide-baseline static or moving cameras. Evaluation is performed on a variety of indoor and outdoor scenes with cluttered backgrounds and multiple dynamic non-rigid objects such as people. Comparison with state-of-the-art approaches demonstrates improved accuracy in both multiple view segmentation and dense reconstruction. The proposed approach also eliminates the requirement for prior knowledge of scene structure and appearance

    Detail-preserving and Content-aware Variational Multi-view Stereo Reconstruction

    Full text link
    Accurate recovery of 3D geometrical surfaces from calibrated 2D multi-view images is a fundamental yet active research area in computer vision. Despite the steady progress in multi-view stereo reconstruction, most existing methods are still limited in recovering fine-scale details and sharp features while suppressing noises, and may fail in reconstructing regions with few textures. To address these limitations, this paper presents a Detail-preserving and Content-aware Variational (DCV) multi-view stereo method, which reconstructs the 3D surface by alternating between reprojection error minimization and mesh denoising. In reprojection error minimization, we propose a novel inter-image similarity measure, which is effective to preserve fine-scale details of the reconstructed surface and builds a connection between guided image filtering and image registration. In mesh denoising, we propose a content-aware p\ell_{p}-minimization algorithm by adaptively estimating the pp value and regularization parameters based on the current input. It is much more promising in suppressing noise while preserving sharp features than conventional isotropic mesh smoothing. Experimental results on benchmark datasets demonstrate that our DCV method is capable of recovering more surface details, and obtains cleaner and more accurate reconstructions than state-of-the-art methods. In particular, our method achieves the best results among all published methods on the Middlebury dino ring and dino sparse ring datasets in terms of both completeness and accuracy.Comment: 14 pages,16 figures. Submitted to IEEE Transaction on image processin

    Pix3D: Dataset and Methods for Single-Image 3D Shape Modeling

    Full text link
    We study 3D shape modeling from a single image and make contributions to it in three aspects. First, we present Pix3D, a large-scale benchmark of diverse image-shape pairs with pixel-level 2D-3D alignment. Pix3D has wide applications in shape-related tasks including reconstruction, retrieval, viewpoint estimation, etc. Building such a large-scale dataset, however, is highly challenging; existing datasets either contain only synthetic data, or lack precise alignment between 2D images and 3D shapes, or only have a small number of images. Second, we calibrate the evaluation criteria for 3D shape reconstruction through behavioral studies, and use them to objectively and systematically benchmark cutting-edge reconstruction algorithms on Pix3D. Third, we design a novel model that simultaneously performs 3D reconstruction and pose estimation; our multi-task learning approach achieves state-of-the-art performance on both tasks.Comment: CVPR 2018. The first two authors contributed equally to this work. Project page: http://pix3d.csail.mit.ed

    Lidar-based Gait Analysis and Activity Recognition in a 4D Surveillance System

    Get PDF
    This paper presents new approaches for gait and activity analysis based on data streams of a Rotating Multi Beam (RMB) Lidar sensor. The proposed algorithms are embedded into an integrated 4D vision and visualization system, which is able to analyze and interactively display real scenarios in natural outdoor environments with walking pedestrians. The main focus of the investigations are gait based person re-identification during tracking, and recognition of specific activity patterns such as bending, waving, making phone calls and checking the time looking at wristwatches. The descriptors for training and recognition are observed and extracted from realistic outdoor surveillance scenarios, where multiple pedestrians are walking in the field of interest following possibly intersecting trajectories, thus the observations might often be affected by occlusions or background noise. Since there is no public database available for such scenarios, we created and published a new Lidar-based outdoors gait and activity dataset on our website, that contains point cloud sequences of 28 different persons extracted and aggregated from 35 minutes-long measurements. The presented results confirm that both efficient gait-based identification and activity recognition is achievable in the sparse point clouds of a single RMB Lidar sensor. After extracting the people trajectories, we synthesized a free-viewpoint video, where moving avatar models follow the trajectories of the observed pedestrians in real time, ensuring that the leg movements of the animated avatars are synchronized with the real gait cycles observed in the Lidar stream

    Temporally Coherent General Dynamic Scene Reconstruction

    Get PDF
    Existing techniques for dynamic scene reconstruction from multiple wide-baseline cameras primarily focus on reconstruction in controlled environments, with fixed calibrated cameras and strong prior constraints. This paper introduces a general approach to obtain a 4D representation of complex dynamic scenes from multi-view wide-baseline static or moving cameras without prior knowledge of the scene structure, appearance, or illumination. Contributions of the work are: An automatic method for initial coarse reconstruction to initialize joint estimation; Sparse-to-dense temporal correspondence integrated with joint multi-view segmentation and reconstruction to introduce temporal coherence; and a general robust approach for joint segmentation refinement and dense reconstruction of dynamic scenes by introducing shape constraint. Comparison with state-of-the-art approaches on a variety of complex indoor and outdoor scenes, demonstrates improved accuracy in both multi-view segmentation and dense reconstruction. This paper demonstrates unsupervised reconstruction of complete temporally coherent 4D scene models with improved non-rigid object segmentation and shape reconstruction and its application to free-viewpoint rendering and virtual reality.Comment: Submitted to IJCV 2019. arXiv admin note: substantial text overlap with arXiv:1603.0338

    Active modelling of virtual humans

    Get PDF
    This thesis provides a complete framework that enables the creation of photorealistic 3D human models in real-world environments. The approach allows a non-expert user to use any digital capture device to obtain four images of an individual and create a personalised 3D model, for multimedia applications. To achieve this, it is necessary that the system is automatic and that the reconstruction process is flexible to account for information that is not available or incorrectly captured. In this approach the individual is automatically extracted from the environment using constrained active B-spline templates that are scaled and automatically initialised using only image information. These templates incorporate the energy minimising framework for Active Contour Models, providing a suitable and flexible method to deal with the adjustments in pose an individual can adopt. The final states of the templates describe the individual’s shape. The contours in each view are combined to form a 3D B-spline surface that characterises an individual’s maximal silhouette equivalent. The surface provides a mould that contains sufficient information to allow for the active deformation of an underlying generic human model. This modelling approach is performed using a novel technique that evolves active-meshes to 3D for deforming the underlying human model, while adaptively constraining it to preserve its existing structure. The active-mesh approach incorporates internal constraints that maintain the structural relationship of the vertices of the human model, while external forces deform the model congruous to the 3D surface mould. The strength of the internal constraints can be reduced to allow the model to adopt the exact shape of the bounding volume or strengthened to preserve the internal structure, particularly in areas of high detail. This novel implementation provides a uniform framework that can be simply and automatically applied to the entire human model
    corecore