Deep learning for cardiac image segmentation: A review
Deep learning has become the most widely used approach for cardiac image segmentation in recent years. In this paper, we provide a review of over 100 cardiac image segmentation papers using deep learning, covering common imaging modalities, including magnetic resonance imaging (MRI), computed tomography (CT), and ultrasound (US), and the major anatomical structures of interest (ventricles, atria, and vessels). In addition, a summary of publicly available cardiac image datasets and code repositories is included to encourage reproducible research. Finally, we discuss the challenges and limitations of current deep learning-based approaches (scarcity of labels, model generalizability across different domains, interpretability) and suggest potential directions for future research.
Deformable Shape Completion with Graph Convolutional Autoencoders
The availability of affordable and portable depth sensors has made scanning
objects and people simpler than ever. However, dealing with occlusions and
missing parts is still a significant challenge. The problem of reconstructing a
(possibly non-rigidly moving) 3D object from a single or multiple partial scans
has received increasing attention in recent years. In this work, we propose a
novel learning-based method for the completion of partial shapes. Unlike the
majority of existing approaches, our method focuses on objects that can undergo
non-rigid deformations. The core of our method is a variational autoencoder
with graph convolutional operations that learns a latent space for complete
realistic shapes. At inference, we optimize to find the representation in this
latent space that best fits the generated shape to the known partial input. The
completed shape exhibits a realistic appearance on the unknown part. We show
promising results towards the completion of synthetic and real scans of human
body and face meshes exhibiting different styles of articulation and
partiality.
Comment: CVPR 201
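The inference step the abstract describes (searching the learned latent space for the code whose decoded shape best matches the observed vertices) can be sketched with a toy linear "decoder" standing in for the graph-convolutional VAE decoder. All names and dimensions below are illustrative assumptions, not the authors' code.

```python
import numpy as np

# Toy latent-space shape completion: optimize a latent code z so the
# decoded shape agrees with the *observed* vertices only; the decoder
# then constrains the unobserved vertices to lie on the learned shape
# space. A linear map W stands in for the trained VAE decoder.

rng = np.random.default_rng(0)
latent_dim, n_verts = 8, 60                   # tiny demo dimensions
W = rng.normal(size=(n_verts, latent_dim))    # "decoder" weights

z_true = rng.normal(size=latent_dim)
full_shape = W @ z_true                       # complete ground-truth shape
observed = np.zeros(n_verts, dtype=bool)
observed[: n_verts // 2] = True               # only half was scanned

z = np.zeros(latent_dim)                      # start at the latent origin
lr = 0.01
for _ in range(500):
    residual = (W @ z - full_shape) * observed    # error on known part only
    z -= lr * (W.T @ residual)                    # gradient step on z

completed = W @ z
# Because the observed half over-determines the 8-dim code, the
# optimized code also explains the unobserved vertices.
err_unseen = np.abs(completed[~observed] - full_shape[~observed]).max()
print(err_unseen)
```

In the paper the decoder is nonlinear, so this optimization is non-convex and is run with a first-order optimizer from an encoder-provided initialization; the structure of the fitting loop is the same.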
Proprioceptive Robot Collision Detection through Gaussian Process Regression
This paper proposes a proprioceptive collision detection algorithm based on
Gaussian Process Regression. Compared to sensor-based collision detection and other
proprioceptive algorithms, the proposed approach has minimal sensing
requirements, since only the currents and the joint configurations are needed.
The algorithm extends the standard Gaussian Process models adopted in learning
the robot inverse dynamics, using a richer set of input locations and an
ad-hoc kernel structure to model the complex, non-linear behaviors due to
friction in quasi-static configurations. Tests performed on a Universal Robots
UR10 show the effectiveness of the proposed algorithm to detect when a
collision has occurred.
Comment: Published at ACC 201
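The detection principle (predict the motor current expected for a joint configuration from a GP trained on collision-free motion, and flag a collision when the measured current leaves the predictive confidence band) can be sketched on a one-joint toy problem. The RBF kernel and all names below are illustrative assumptions; the paper uses a richer input set and an ad-hoc kernel.

```python
import numpy as np

# Toy GP-residual collision detector: regress current on joint angle
# from collision-free data, then threshold the prediction residual at
# a few predictive standard deviations.

def rbf(a, b, ell=0.5):
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / ell) ** 2)

rng = np.random.default_rng(1)
q_train = np.linspace(-np.pi, np.pi, 40)                  # joint angles (1 DoF)
i_train = np.sin(q_train) + 0.01 * rng.normal(size=40)    # noisy "currents"

noise = 1e-4                                  # observation noise variance
K = rbf(q_train, q_train) + noise * np.eye(40)
alpha = np.linalg.solve(K, i_train)

def predict(q):
    k = rbf(np.atleast_1d(q), q_train)
    mean = k @ alpha
    var = 1.0 - np.sum(k * np.linalg.solve(K, k.T).T, axis=1) + noise
    return mean, np.sqrt(np.maximum(var, 1e-12))

def collision(q, i_measured, k_sigma=4.0):
    mean, std = predict(q)
    return bool(np.abs(i_measured - mean[0]) > k_sigma * std[0])

q = 0.3
free_current = np.sin(q)                      # current during free motion
print(collision(q, free_current))             # nominal: no alarm
print(collision(q, free_current + 0.5))       # large residual: collision
```

The appeal noted in the abstract is visible even in the sketch: only joint positions and currents enter the detector, with no external force/torque sensing.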
Improving Neural Radiance Fields with Depth-aware Optimization for Novel View Synthesis
With dense inputs, Neural Radiance Fields (NeRF) is able to render
photo-realistic novel views under static conditions. Although the synthesis
quality is excellent, existing NeRF-based methods fail to recover accurate
three-dimensional (3D) structure. With sparse inputs, novel view synthesis
quality drops dramatically because the implicitly reconstructed 3D-scene
structure is inaccurate. We propose SfMNeRF, a method to better synthesize novel
views as well as reconstruct the 3D-scene geometry. SfMNeRF leverages the
knowledge from the self-supervised depth estimation methods to constrain the
3D-scene geometry during view synthesis training. Specifically, SfMNeRF employs
the epipolar, photometric consistency, depth smoothness, and
position-of-matches constraints to explicitly reconstruct the 3D-scene
structure. Through these explicit constraints and the implicit constraint from
NeRF, our method improves the view synthesis as well as the 3D-scene geometry
performance of NeRF at the same time. In addition, SfMNeRF synthesizes novel
sub-pixels in which the ground truth is obtained by image interpolation. This
strategy enables SfMNeRF to include more samples to improve generalization
performance. Experiments on two public datasets demonstrate that SfMNeRF
surpasses state-of-the-art approaches. Code is available at
https://github.com/XTU-PR-LAB/SfMNeR
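The photometric-consistency constraint named in the abstract can be sketched for a single pixel: back-project it with its predicted depth, transform by the relative camera pose, re-project into the second view, and penalize the color difference. The intrinsics, scene, and function names below are toy assumptions, not the SfMNeRF implementation (which combines this with epipolar, depth-smoothness, and match-position terms).

```python
import numpy as np

# Toy photometric-consistency check: with the correct depth, a pixel
# reprojects onto the same scene point in the second view, so its
# photometric error vanishes; a wrong depth lands elsewhere.

K = np.array([[100.0, 0, 32], [0, 100.0, 32], [0, 0, 1]])  # toy intrinsics

def reproject(uv, depth, R, t):
    """Map pixel uv of view 1 (with predicted depth) into view 2."""
    p = np.array([uv[0], uv[1], 1.0])
    X = depth * np.linalg.inv(K) @ p        # back-project to 3D
    X2 = R @ X + t                          # move into view-2 frame
    q = K @ X2
    return q[:2] / q[2]                     # perspective divide

def photometric_loss(img1, img2, uv, depth, R, t):
    u2, v2 = reproject(uv, depth, R, t)
    u2i, v2i = int(round(u2)), int(round(v2))   # nearest-neighbor lookup
    return abs(float(img1[uv[1], uv[0]]) - float(img2[v2i, u2i]))

# Fronto-parallel scene, second camera shifted 0.1 along x:
# disparity = f * baseline / depth = 100 * 0.1 / 2 = 5 pixels.
rng = np.random.default_rng(2)
img1 = rng.random((64, 64))
R, t = np.eye(3), np.array([-0.1, 0.0, 0.0])
img2 = np.roll(img1, -5, axis=1)

good = photometric_loss(img1, img2, (20, 20), 2.0, R, t)  # correct depth
bad = photometric_loss(img1, img2, (20, 20), 1.0, R, t)   # wrong depth
print(good, bad)
```

In training this error is made differentiable with bilinear rather than nearest-neighbor sampling, so its gradient can shape the radiance field's geometry.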
FML: Face Model Learning from Videos
Monocular image-based 3D reconstruction of faces is a long-standing problem
in computer vision. Since image data is a 2D projection of a 3D face, the
resulting depth ambiguity makes the problem ill-posed. Most existing methods
rely on data-driven priors that are built from limited 3D face scans. In
contrast, we propose multi-frame video-based self-supervised training of a deep
network that (i) learns a face identity model both in shape and appearance
while (ii) jointly learning to reconstruct 3D faces. Our face model is learned
using only corpora of in-the-wild video clips collected from the Internet. This
virtually endless source of training data enables learning of a highly general
3D face model. In order to achieve this, we propose a novel multi-frame
consistency loss that ensures consistent shape and appearance across multiple
frames of a subject's face, thus minimizing depth ambiguity. At test time we
can use an arbitrary number of frames, so that we can perform both monocular
and multi-frame reconstruction.
Comment: CVPR 2019 (Oral). Video: https://www.youtube.com/watch?v=SG2BwxCw0lQ,
Project Page: https://gvv.mpi-inf.mpg.de/projects/FML19