Efficient Online Surface Correction for Real-time Large-Scale 3D Reconstruction
State-of-the-art methods for large-scale 3D reconstruction from RGB-D sensors
usually reduce drift in camera tracking by globally optimizing the estimated
camera poses in real-time without simultaneously updating the reconstructed
surface on pose changes. We propose an efficient on-the-fly surface correction
method for globally consistent dense 3D reconstruction of large-scale scenes.
Our approach uses a dense Visual RGB-D SLAM system that estimates the camera
motion in real-time on a CPU and refines it in a global pose graph
optimization. Consecutive RGB-D frames are locally fused into keyframes, which
are incorporated into a sparse voxel hashed Signed Distance Field (SDF) on the
GPU. On pose graph updates, the SDF volume is corrected on-the-fly using a
novel keyframe re-integration strategy with reduced GPU-host streaming. We
demonstrate in an extensive quantitative evaluation that our method is up to
93% more runtime efficient compared to the state-of-the-art and requires
significantly less memory, with only negligible loss of surface quality.
Overall, our system requires only a single GPU and allows for real-time surface
correction of large environments. Comment: British Machine Vision Conference (BMVC), London, September 201
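The keyframe re-integration mentioned in this abstract can be illustrated with a minimal per-voxel sketch. The abstract does not state the update rule, so the code below assumes the weighted running average commonly used in SDF fusion; the names `Voxel`, `integrate`, `deintegrate`, and `reintegrate` are hypothetical and not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class Voxel:
    sdf: float = 0.0     # running weighted mean of signed distance samples
    weight: float = 0.0  # accumulated integration weight


def integrate(v: Voxel, d: float, w: float = 1.0) -> None:
    """Fuse one signed distance sample d with weight w into the voxel."""
    total = v.weight + w
    v.sdf = (v.sdf * v.weight + d * w) / total
    v.weight = total


def deintegrate(v: Voxel, d: float, w: float = 1.0) -> None:
    """Remove a previously fused sample (assumed inverse of integrate)."""
    remaining = v.weight - w
    if remaining <= 1e-9:
        # No observations left: reset the voxel to its empty state.
        v.sdf, v.weight = 0.0, 0.0
        return
    v.sdf = (v.sdf * v.weight - d * w) / remaining
    v.weight = remaining


def reintegrate(v: Voxel, old_d: float, new_d: float, w: float = 1.0) -> None:
    """Replace a keyframe's contribution after its pose has been corrected."""
    deintegrate(v, old_d, w)
    integrate(v, new_d, w)
```

Under these assumptions, a pose graph update only touches the voxels observed by the affected keyframes: each voxel drops the sample computed under the old pose and fuses the sample computed under the corrected pose.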
VNect: Real-time 3D Human Pose Estimation with a Single RGB Camera
We present the first real-time method to capture the full global 3D skeletal
pose of a human in a stable, temporally consistent manner using a single RGB
camera. Our method combines a new convolutional neural network (CNN) based pose
regressor with kinematic skeleton fitting. Our novel fully-convolutional pose
formulation regresses 2D and 3D joint positions jointly in real time and does
not require tightly cropped input frames. A real-time kinematic skeleton
fitting method uses the CNN output to yield temporally stable 3D global pose
reconstructions on the basis of a coherent kinematic skeleton. This makes our
approach the first monocular RGB method usable in real-time applications such
as 3D character control; thus far, the only monocular methods for such
applications employed specialized RGB-D cameras. Our method's accuracy is
quantitatively on par with the best offline 3D monocular RGB pose estimation
methods. Our results are qualitatively comparable to, and sometimes better
than, results from monocular RGB-D approaches, such as the Kinect. However, we
show that our approach is more broadly applicable than RGB-D solutions, i.e. it
works for outdoor scenes, community videos, and low quality commodity RGB
cameras. Comment: Accepted to SIGGRAPH 201
Depth sensors in augmented reality solutions. Literature review
The emergence of depth sensors has made it possible to track not only monocular
cues but also the actual depth values of the environment. This is especially
useful in augmented reality solutions, where the position and orientation (pose) of
the observer need to be accurately determined. This allows virtual objects to be
placed in the user's view through, for example, the screen of a tablet or augmented
reality glasses (e.g. Google Glass). Although early 3D sensors have
been physically quite large, their size is decreasing, and eventually
a 3D sensor could possibly be embedded in, for example, augmented reality
glasses. The wider subject area considered in this review is 3D SLAM methods,
which take advantage of the 3D information provided by modern RGB-D sensors,
such as Microsoft Kinect. Thus, a review of SLAM (Simultaneous Localization
and Mapping) and 3D tracking in augmented reality is timely. We also try
to identify the limitations and possibilities of different tracking methods, and how
they should be improved to allow efficient integration of the methods into
the augmented reality solutions of the future.
Mining Spatial-Temporal Patterns and Structural Sparsity for Human Motion Data Denoising
Motion capture is an important technique with a wide range of applications in areas such as computer vision, computer animation, film production, and medical rehabilitation. Even with professional motion capture systems, the acquired raw data usually contain noise and outliers. Numerous methods have been developed to denoise the data, but the problem remains a challenge due to the high complexity of human motion and the diversity of real-life situations. In this paper, we propose a robust data-driven human motion denoising approach that mines the spatial-temporal patterns and the structural sparsity embedded in motion data. We first replace the commonly used entire-pose model with a much finer-grained partlet model as the feature representation, to exploit the abundant similarities in local body part posture and movement. Then, a robust dictionary learning algorithm is proposed to learn multiple compact and representative motion dictionaries from the training data in parallel. Finally, we reformulate the human motion denoising problem as a robust structured sparse coding problem in which both the noise distribution information and the temporal smoothness property of human motion are jointly taken into account. Compared with several state-of-the-art motion denoising methods on both synthetic and real noisy motion data, our method consistently yields better performance than its counterparts. The outputs of our approach are much more stable than those of the others. In addition, it is much easier to set up the training dataset for our method than for other data-driven methods.
3D scanning of cultural heritage with consumer depth cameras
Three-dimensional reconstruction of cultural heritage objects is an expensive and time-consuming process. Recent consumer real-time depth acquisition devices, such as Microsoft Kinect, allow very fast and simple acquisition of 3D views. However, 3D scanning with such devices is a challenging task due to the limited accuracy and reliability of the acquired data. This paper introduces a 3D reconstruction pipeline suited to using consumer depth cameras as hand-held scanners for cultural heritage objects. Several new contributions have been made to achieve this result. They include an ad-hoc filtering scheme that exploits the model of the error on the acquired data and a novel algorithm for the extraction of salient points that exploits both depth and color data. The salient points are then used within a modified version of the ICP algorithm that exploits both geometry and color distances to precisely align the views even when geometry information is not sufficient to constrain the registration. The proposed method, although applicable to generic scenes, has been tuned to the acquisition of sculptures, and in this setting its performance is rather interesting, as the experimental results indicate.