DepthCut: Improved Depth Edge Estimation Using Multiple Unreliable Channels
In the context of scene understanding, a variety of methods exist to
estimate different information channels from mono or stereo images, including
disparity, depth, and normals. Although several advances have been reported in
recent years for these tasks, the estimated information is often imprecise,
particularly near depth discontinuities or creases. However, studies have shown
that precisely such depth edges carry critical cues for the perception of
shape, and play important roles in tasks like depth-based segmentation or
foreground selection. Unfortunately, the currently extracted channels often
carry conflicting signals, making it difficult for subsequent applications to
effectively use them. In this paper, we focus on the problem of obtaining
high-precision depth edges (i.e., depth contours and creases) by jointly
analyzing such unreliable information channels. We propose DepthCut, a
data-driven fusion of the channels using a convolutional neural network trained
on a large dataset with known depth. The resulting depth edges can be used for
segmentation, decomposing a scene into depth layers with relatively flat depth,
or improving the accuracy of the depth estimate near depth edges by
constraining its gradients to agree with these edges. Quantitatively, we
compare against 15 variants of baselines and demonstrate that our depth edges
result in an improved segmentation performance and an improved depth estimate
near depth edges compared to data-agnostic channel fusion. Qualitatively, we
demonstrate that the depth edges result in superior segmentation and depth
orderings.
Comment: 12 page
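The data-driven fusion the abstract describes can be illustrated with a minimal sketch: per-channel gradient magnitudes stand in for the unreliable edge cues, and a fixed linear combination stands in for the learned CNN (this fixed weighting is exactly the kind of data-agnostic baseline the paper compares against; all names below are illustrative).

```python
import numpy as np

def edge_response(channel):
    # Gradient magnitude of one (noisy) channel, e.g. disparity or a
    # normal component, as a crude per-channel depth-edge cue.
    gx = np.abs(np.diff(channel, axis=1, append=channel[:, -1:]))
    gy = np.abs(np.diff(channel, axis=0, append=channel[-1:, :]))
    return gx + gy

def fuse_edges(channels, weights):
    # Weighted fusion of the unreliable cues. DepthCut learns the fusion
    # with a CNN trained on data with known depth; a fixed linear
    # combination is a data-agnostic stand-in for this sketch.
    return np.clip(sum(w * edge_response(c) for c, w in zip(channels, weights)),
                   0.0, 1.0)
```

For a clean step discontinuity shared by two channels, the fused response peaks exactly at the edge column.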
Explainable cardiac pathology classification on cine MRI with motion characterization by semi-supervised learning of apparent flow
We propose a method to classify cardiac pathology based on a novel approach
to extract image derived features to characterize the shape and motion of the
heart. An original semi-supervised learning procedure, which makes efficient
use of a large amount of non-segmented images and a small amount of images
segmented manually by experts, is developed to generate pixel-wise apparent
flow between two time points of a 2D+t cine MRI image sequence. Combining the
apparent flow maps and cardiac segmentation masks, we obtain a local apparent
flow corresponding to the 2D motion of myocardium and ventricular cavities.
This leads to the generation of time series of the radius and thickness of
myocardial segments to represent cardiac motion. These time series of motion
features are reliable and explainable characteristics of pathological cardiac
motion. Furthermore, they are combined with shape-related features to classify
cardiac pathologies. Using only nine feature values as input, we propose an
explainable, simple and flexible model for pathology classification. On the
ACDC training and testing sets, the model achieves classification accuracies
of 95% and 94%, respectively, making its performance comparable to the
state-of-the-art. Comparison with various other models is performed to outline
some advantages of our model.
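The abstract does not specify the classifier operating on the nine feature values; a nearest-centroid model is one explainable, simple stand-in of the kind described (the synthetic features and labels below are illustrative, not ACDC data).

```python
import numpy as np

class NearestCentroid:
    # Explainable classifier over nine shape/motion features: each
    # pathology class is summarized by the mean of its feature vectors,
    # and a new case is assigned to the closest class centroid.
    def fit(self, X, y):
        self.classes_ = np.unique(y)
        self.centroids_ = np.stack([X[y == c].mean(axis=0)
                                    for c in self.classes_])
        return self

    def predict(self, X):
        dists = np.linalg.norm(X[:, None, :] - self.centroids_[None, :, :],
                               axis=2)
        return self.classes_[dists.argmin(axis=1)]
```

Because the decision reduces to distances from per-class mean feature vectors, each prediction can be explained directly in terms of the nine input features.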
Semi-Global Stereo Matching with Surface Orientation Priors
Semi-Global Matching (SGM) is a widely-used efficient stereo matching
technique. It works well for textured scenes, but fails on untextured slanted
surfaces due to its fronto-parallel smoothness assumption. To remedy this
problem, we propose a simple extension, termed SGM-P, to utilize precomputed
surface orientation priors. Such priors favor different surface slants in
different 2D image regions or 3D scene regions and can be derived in various
ways. In this paper we evaluate plane orientation priors derived from stereo
matching at a coarser resolution and show that such priors can yield
significant performance gains for difficult weakly-textured scenes. We also
explore surface normal priors derived from Manhattan-world assumptions, and we
analyze the potential performance gains using oracle priors derived from
ground-truth data. SGM-P only adds a minor computational overhead to SGM and is
an attractive alternative to more complex methods employing higher-order
smoothness terms.
Comment: extended draft of 3DV 2017 (spotlight) pape
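A 1D sketch shows how an orientation prior can enter SGM's scanline aggregation: re-indexing the previous pixel's costs so that "no penalty" means following the prior's predicted disparity change. The indexing below is a simplified illustration of the SGM-P idea, not the paper's exact formulation.

```python
import numpy as np

def sgm_scanline(cost, p1=1.0, p2=4.0, prior_slant=None):
    # 1D SGM cost aggregation along one scanline. `cost` is (W, D).
    # `prior_slant[x]` is the prior's expected integer disparity change
    # from pixel x-1 to x; all zeros recovers the usual fronto-parallel
    # smoothness, while a nonzero slant makes following the predicted
    # surface orientation the penalty-free transition.
    W, D = cost.shape
    if prior_slant is None:
        prior_slant = np.zeros(W, dtype=int)
    L = cost.astype(float).copy()
    d = np.arange(D)
    for x in range(1, W):
        prev = L[x - 1]
        idx = np.clip(d - prior_slant[x], 0, D - 1)  # re-center on the prior
        same = prev[idx]                             # follow the slant: free
        near = np.minimum(prev[np.clip(idx - 1, 0, D - 1)],
                          prev[np.clip(idx + 1, 0, D - 1)]) + p1
        jump = prev.min() + p2                       # large disparity jump
        L[x] += np.minimum(np.minimum(same, near), jump) - prev.min()
    return L
```

On a weakly-textured slanted surface whose true disparity grows by one per pixel, a matching slant prior lets the aggregated minimum track the surface instead of being dragged toward a fronto-parallel solution.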
Scalable Full Flow with Learned Binary Descriptors
We propose a method for large displacement optical flow in which local
matching costs are learned by a convolutional neural network (CNN) and a
smoothness prior is imposed by a conditional random field (CRF). We tackle the
computation- and memory-intensive operations on the 4D cost volume with a
min-projection, which reduces memory complexity from quadratic to linear, and
with binary descriptors for efficient matching. This enables evaluation of the
cost on the fly and allows us to perform learning and CRF inference on
high-resolution images without ever storing the 4D cost volume. To address the
problem of
learning binary descriptors, we propose a new hybrid learning scheme. In
contrast to current state-of-the-art approaches for learning binary CNNs, we
can compute the exact non-zero gradient within our model. We compare several
methods for training binary descriptors and show results on publicly available
benchmarks.
Comment: GCPR 201
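The matching side of the abstract can be sketched along one scanline: XOR plus popcount gives the Hamming cost between packed binary descriptors, and only the per-pixel minimum over disparities is kept, so the cost volume is never materialized. The CNN that produces the descriptors and the CRF inference are outside this sketch; names are illustrative.

```python
import numpy as np

def match_scanline(left_desc, right_desc, max_disp):
    # Winner-takes-all matching of packed binary descriptors along one
    # scanline. left_desc/right_desc are (W, B) uint8 arrays (B bytes of
    # descriptor bits per pixel). The Hamming cost is evaluated on the
    # fly and immediately min-projected over disparities, so only the
    # per-pixel best disparity is stored.
    W = left_desc.shape[0]
    disp = np.zeros(W, dtype=int)
    for x in range(W):
        lo = max(0, x - max_disp)
        cand = right_desc[lo:x + 1]                   # right pixels x-d
        xor = left_desc[x][None, :] ^ cand
        ham = np.unpackbits(xor, axis=1).sum(axis=1)  # popcount per pixel
        disp[x] = x - (lo + ham.argmin())
    return disp
```

With distinct descriptors and a left view that is a pure shift of the right view, the recovered disparity equals the true shift wherever both views observe the pixel.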
Learning to Predict Image-based Rendering Artifacts with Respect to a Hidden Reference Image
Image metrics predict the perceived per-pixel difference between a reference
image and its degraded (e.g., re-rendered) version. In several important
applications, the reference image is not available and image metrics cannot be
applied. We devise a neural network architecture and training procedure that
allows predicting the MSE, SSIM or VGG16 image difference from the distorted
image alone while the reference is not observed. This is enabled by two
insights: The first is to inject sufficiently many un-distorted natural image
patches, which can be found in arbitrary amounts and are known to have no
perceivable difference to themselves. This avoids false positives. The second
is to balance the learning, where it is carefully made sure that all image
errors are equally likely, avoiding false negatives. Surprisingly, we observe
that the resulting no-reference metric can, subjectively, even perform better
than the reference-based one, as it had to become robust against
misalignments. We evaluate the effectiveness of our approach in an image-based
rendering context, both quantitatively and qualitatively. Finally, we
demonstrate two applications which reduce light field capture time and provide
guidance for interactive depth adjustment.
Comment: 13 pages, 11 figure
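The first training insight can be sketched as a batch-construction step: pristine natural patches are injected with a zero-error label, since an undistorted patch has no perceivable difference to itself. The network, the error metrics, and the exact balancing scheme are omitted; all names here are illustrative.

```python
import numpy as np

def build_batch(distorted, errors, pristine, rng):
    # Insight 1: inject undistorted natural patches labeled with zero
    # error -- they are known to have no perceivable difference to
    # themselves, which suppresses false positives of the learned
    # no-reference metric.
    X = np.concatenate([distorted, pristine])
    y = np.concatenate([errors, np.zeros(len(pristine))])
    # Insight 2 (balancing all error magnitudes equally) is only
    # approximated here by shuffling an already balanced pool.
    order = rng.permutation(len(X))
    return X[order], y[order]
```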
Guided Stereo Matching
Stereo is a prominent technique to infer dense depth maps from images, and
deep learning further pushed forward the state-of-the-art, making end-to-end
architectures unrivaled when enough data is available for training. However,
deep networks suffer from significant drops in accuracy when dealing with new
environments. Therefore, in this paper, we introduce Guided Stereo Matching, a
novel paradigm that leverages a small amount of sparse yet reliable depth
measurements retrieved from an external source to address this weakness. The
additional sparse cues required by our method can be obtained with any strategy
(e.g., a LiDAR) and are used to enhance features linked to corresponding
disparity hypotheses. Our formulation is general and fully differentiable, thus
enabling the additional sparse inputs to be exploited both in pre-trained deep
stereo networks and when training a new instance from scratch. Extensive
experiments on three standard datasets and two state-of-the-art deep
architectures show that, even with a small set of sparse input cues, i) the
proposed paradigm enables significant improvements to pre-trained networks,
ii) training from scratch notably increases accuracy and robustness to domain
shifts, and iii) the approach is suited to and effective with traditional
stereo algorithms such as SGM.
Comment: CVPR 201
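The guiding mechanism can be sketched on a cost volume: a Gaussian centered on each hinted disparity modulates the volume so that hypotheses agreeing with the hint become cheaper. The paper enhances learned features multiplicatively; dividing a classical cost volume (lower is better) is an illustrative analogue, and the peak gain `k` and bandwidth `width` are assumed parameters, not the paper's values.

```python
import numpy as np

def guide_volume(cost, hints, valid, k=10.0, width=1.0):
    # cost: (H, W, D) matching costs, lower is better.
    # hints: (H, W) sparse disparity measurements (e.g. from a LiDAR).
    # valid: (H, W) boolean mask of pixels that received a hint.
    # A Gaussian peaked at the hinted disparity rescales the costs so
    # that hypotheses consistent with the hint are favored; pixels
    # without hints are left untouched.
    D = cost.shape[2]
    d = np.arange(D)[None, None, :]
    gauss = 1.0 + (k - 1.0) * np.exp(-(d - hints[..., None]) ** 2
                                     / (2.0 * width ** 2))
    gauss = np.where(valid[..., None], gauss, 1.0)
    return cost / gauss
```

On a flat (uninformative) cost volume, a hinted pixel ends up preferring the hinted disparity while unhinted pixels are unchanged, which mirrors how the sparse cues steer otherwise ambiguous matches.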