69 research outputs found
Co-Fusion: Real-time Segmentation, Tracking and Fusion of Multiple Objects
In this paper we introduce Co-Fusion, a dense SLAM system that takes a live
stream of RGB-D images as input and segments the scene into different objects
(using either motion or semantic cues) while simultaneously tracking and
reconstructing their 3D shape in real time. We use a multiple model fitting
approach where each object can move independently from the background and still
be effectively tracked and its shape fused over time using only the information
from pixels associated with that object label. Previous attempts to deal with
dynamic scenes have typically considered moving regions as outliers, and
consequently do not model their shape or track their motion over time. In
contrast, we enable the robot to maintain 3D models for each of the segmented
objects and to improve them over time through fusion. As a result, our system
enables a robot to maintain an object-level scene description, which has the
potential to allow interaction with its working environment, even in dynamic scenes.
Comment: International Conference on Robotics and Automation (ICRA) 2017,
http://visual.cs.ucl.ac.uk/pubs/cofusion,
https://github.com/martinruenz/co-fusion
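
As an illustration of the multiple-model fitting idea described above, the following minimal Python sketch keeps one fused model per object label and tracks each independently using only its own pixels. All names here (ObjectModel, segment, track_icp, process_frame) are hypothetical placeholders, not Co-Fusion's actual code.

    # Hedged sketch of a per-object track-then-fuse loop; not Co-Fusion's API.
    from dataclasses import dataclass, field
    import numpy as np

    @dataclass
    class ObjectModel:
        pose: np.ndarray = field(default_factory=lambda: np.eye(4))  # rigid pose
        fused: list = field(default_factory=list)                    # surface samples

    def segment(depth):
        """Placeholder: label each pixel by object (0 = static background)."""
        return np.zeros(depth.shape, dtype=np.int32)

    def track_icp(model, depth, mask):
        """Placeholder: estimate this object's motion from its own pixels only."""
        return model.pose  # a real system would run projective ICP here

    def process_frame(models, depth):
        labels = segment(depth)
        for label in np.unique(labels):
            model = models.setdefault(int(label), ObjectModel())  # new label -> new model
            mask = labels == label
            model.pose = track_icp(model, depth, mask)            # track independently
            model.fused.append(depth[mask])                       # fuse only these pixels

The key design point is that tracking and fusion never see pixels belonging to other labels, which is why a moving object no longer has to be discarded as an outlier.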
Lifting from the Deep: Convolutional 3D Pose Estimation from a Single Image
We propose a unified formulation for the problem of 3D human pose estimation
from a single raw RGB image that reasons jointly about 2D joint estimation and
3D pose reconstruction to improve both tasks. We take an integrated approach
that fuses probabilistic knowledge of 3D human pose with a multi-stage CNN
architecture and uses the knowledge of plausible 3D landmark locations to
refine the search for better 2D locations. The entire process is trained
end-to-end, is extremely efficient, and obtains state-of-the-art results on
Human3.6M, outperforming previous approaches on both 2D and 3D errors.
Comment: Paper presented at CVPR 2017
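
The 2D/3D interplay the abstract describes can be pictured with a toy version of the lifting step: fit a linear 3D pose basis (standing in for the paper's probabilistic prior) to the current 2D joint estimates, then reproject to refine them. Everything below, including the orthographic camera and the function names, is an illustrative assumption, not the authors' model.

    # Toy 2D->3D->2D refinement loop; a linear pose basis stands in for the
    # probabilistic 3D prior, and reprojection stands in for the CNN fusion stage.
    import numpy as np

    P = np.array([[1.0, 0.0, 0.0],
                  [0.0, 1.0, 0.0]])                      # orthographic projection

    def lift(joints_2d, mean_3d, basis):
        """Least-squares fit of basis coefficients so the projection matches 2D.
        joints_2d: (J, 2); mean_3d: (J, 3); basis: (K, J, 3)."""
        A = np.stack([b @ P.T for b in basis], axis=-1).reshape(-1, len(basis))
        r = (joints_2d - mean_3d @ P.T).reshape(-1)
        coeffs, *_ = np.linalg.lstsq(A, r, rcond=None)
        return mean_3d + np.tensordot(coeffs, basis, axes=1)

    def refine(joints_2d, mean_3d, basis, stages=3):
        for _ in range(stages):
            pose_3d = lift(joints_2d, mean_3d, basis)    # 3D prior explains 2D evidence
            joints_2d = pose_3d @ P.T                    # reproject to steer next stage
        return joints_2d, pose_3d

In the paper's actual architecture each stage is a CNN that fuses the reprojected landmarks with image evidence; the loop above only shows why the 3D prior can pull implausible 2D detections toward better locations.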
DynamicSurf: Dynamic Neural RGB-D Surface Reconstruction with an Optimizable Feature Grid
We propose DynamicSurf, a model-free neural implicit surface reconstruction
method for high-fidelity 3D modelling of non-rigid surfaces from monocular
RGB-D video. To cope with the lack of multi-view cues in monocular sequences of
deforming surfaces, one of the most challenging settings for 3D reconstruction,
DynamicSurf exploits depth, surface normals, and RGB losses to improve
reconstruction fidelity and optimisation time. DynamicSurf learns a neural
deformation field that maps a canonical representation of the surface geometry
to the current frame. We depart from current neural non-rigid surface
reconstruction models by designing the canonical representation as a learned
feature grid which leads to faster and more accurate surface reconstruction
than competing approaches that use a single MLP. We demonstrate DynamicSurf on
public datasets and show that it can optimize sequences with varying numbers
of frames with a speedup over pure MLP-based approaches while achieving
results comparable to state-of-the-art methods. The project is available at
https://mirgahney.github.io//DynamicSurf.io/
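
A minimal PyTorch sketch of the two pieces the abstract names: a learnable canonical feature grid decoded to an SDF, and an MLP deformation field that warps points from each frame back to canonical space. Resolutions, layer widths, and the time encoding are illustrative assumptions, not DynamicSurf's configuration.

    # Sketch only: canonical feature grid + deformation field, not DynamicSurf itself.
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class CanonicalGrid(nn.Module):
        """Learned feature grid decoded to an SDF value per query point."""
        def __init__(self, res=64, feat=8):
            super().__init__()
            self.grid = nn.Parameter(torch.zeros(1, feat, res, res, res))
            self.decode = nn.Sequential(nn.Linear(feat, 64), nn.ReLU(), nn.Linear(64, 1))

        def forward(self, x):                            # x: (N, 3) in [-1, 1]^3
            g = x.view(1, -1, 1, 1, 3)                   # layout grid_sample expects
            f = F.grid_sample(self.grid, g, align_corners=True)  # trilinear lookup
            return self.decode(f.view(f.shape[1], -1).t())       # (N, 1) SDF

    class DeformationField(nn.Module):
        """Maps a point observed at frame t back into the canonical space."""
        def __init__(self, t_dim=8):
            super().__init__()
            self.mlp = nn.Sequential(nn.Linear(3 + t_dim, 128), nn.ReLU(),
                                     nn.Linear(128, 3))

        def forward(self, x, t_code):                    # t_code: (t_dim,) per frame
            h = torch.cat([x, t_code.expand(x.shape[0], -1)], dim=-1)
            return x + self.mlp(h)                       # predicted canonical position

    # sdf_t(x) = CanonicalGrid()(DeformationField()(x, t_code)); the depth, surface
    # normal, and RGB losses would be applied to renderings of this composed function.

Querying a feature grid plus a small decoder is cheaper per point than evaluating one large MLP, which is the intuition behind the reported speedup over pure MLP-based canonical representations.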
GNPM: Geometric-Aware Neural Parametric Models
We propose Geometric Neural Parametric Models (GNPM), a learned parametric
model that takes into account the local structure of data to learn disentangled
shape and pose latent spaces of 4D dynamics, using a geometric-aware
architecture on point clouds. Temporally consistent 3D deformations are
estimated without the need for dense correspondences at training time, by
exploiting cycle consistency. Besides its ability to learn dense
correspondences, GNPM also enables latent-space manipulations such as
interpolation and shape/pose transfer. We evaluate GNPM on various datasets of
clothed humans and show that it achieves performance comparable to
state-of-the-art methods that require dense correspondences during training.
Comment: 10 pages, 8 figures
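
The latent-space manipulations mentioned above follow directly from the disentangled design: with separate shape and pose codes feeding a point-cloud decoder, transfer is a code swap and interpolation is a blend. The decoder below is a deliberately tiny stand-in, not the GNPM architecture.

    # Illustrative shape/pose disentanglement; sizes and modules are assumptions.
    import torch
    import torch.nn as nn

    class Decoder(nn.Module):
        """Deforms canonical points given a shape code and a pose code."""
        def __init__(self, shape_dim=128, pose_dim=64):
            super().__init__()
            self.mlp = nn.Sequential(nn.Linear(3 + shape_dim + pose_dim, 256),
                                     nn.ReLU(), nn.Linear(256, 3))

        def forward(self, pts, z_shape, z_pose):         # pts: (N, 3)
            z = torch.cat([z_shape, z_pose]).expand(pts.shape[0], -1)
            return pts + self.mlp(torch.cat([pts, z], dim=-1))

    decoder = Decoder()
    pts = torch.rand(1024, 3)
    z_shape_a, z_pose_a = torch.randn(128), torch.randn(64)
    z_shape_b, z_pose_b = torch.randn(128), torch.randn(64)

    pose_transfer = decoder(pts, z_shape_a, z_pose_b)    # subject A in B's pose
    alpha = 0.5                                          # midpoint pose interpolation
    interp = decoder(pts, z_shape_a, (1 - alpha) * z_pose_a + alpha * z_pose_b)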