Depth from Monocular Images using a Semi-Parallel Deep Neural Network (SPDNN) Hybrid Architecture
Deep neural networks have been applied to a wide range of problems in recent years.
In this work, a Convolutional Neural Network (CNN) is applied to the problem of
determining the depth from a single camera image (monocular depth). Eight
different networks are designed to perform depth estimation, each of them
suitable for a feature level. Networks with different pooling sizes determine
different feature levels. After designing a set of networks, these models may
be combined into a single network topology using graph optimization techniques.
This "Semi Parallel Deep Neural Network (SPDNN)" eliminates duplicated common
network layers, and can be further optimized by retraining to achieve an
improved model compared to the individual topologies. In this study, four SPDNN
models are trained and evaluated in two stages on the KITTI dataset.
The ground truth images in the first part of the experiment are provided by the
benchmark, and for the second part, the ground truth images are the depth map
results from applying a state-of-the-art stereo matching method. The results of
this evaluation demonstrate that using post-processing techniques to refine the
target of the network increases the accuracy of depth estimation on individual
mono images. The second evaluation shows that using segmentation data alongside
the original data as the input can improve the depth estimation results to a
point where performance is comparable with stereo depth estimation. The
computational time is also discussed in this study.
Comment: 44 pages, 25 figure
Introducing Geometry in Active Learning for Image Segmentation
We propose an Active Learning approach to training a segmentation classifier
that exploits geometric priors to streamline the annotation process in 3D image
volumes. To this end, we use these priors not only to select the voxels most in
need of annotation but also to guarantee that they lie on a 2D planar patch, which
makes them much easier to annotate than if they were randomly distributed in the
volume. A simplified version of this approach is effective in natural 2D
images. We evaluated our approach on Electron Microscopy and Magnetic Resonance
image volumes, as well as on natural images. Comparing our approach against
several accepted baselines demonstrates a marked performance increase.
Multiclass Data Segmentation using Diffuse Interface Methods on Graphs
We present two graph-based algorithms for multiclass segmentation of
high-dimensional data. The algorithms use a diffuse interface model based on
the Ginzburg-Landau functional, related to total variation compressed sensing
and image processing. A multiclass extension is introduced using the Gibbs
simplex, with the functional's double-well potential modified to handle the
multiclass case. The first algorithm minimizes the functional using a convex
splitting numerical scheme. The second algorithm uses a graph adaptation
of the classical numerical Merriman-Bence-Osher (MBO) scheme, which alternates
between diffusion and thresholding. We demonstrate the performance of both
algorithms experimentally on synthetic data, grayscale and color images, and
several benchmark data sets such as MNIST, COIL and WebKB. We also make use of
fast numerical solvers for finding the eigenvectors and eigenvalues of the
graph Laplacian, and take advantage of the sparsity of the matrix. Experiments
indicate that the results are competitive with or better than the current
state-of-the-art multiclass segmentation algorithms.
Comment: 14 page
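The alternation between diffusion and thresholding described in the abstract above can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `graph_mbo`, the parameters `dt` and `mu`, the use of the unnormalized Laplacian with a full eigendecomposition, and the fidelity term that pulls labeled nodes toward their known class are all assumptions made for illustration.

```python
import numpy as np

def graph_mbo(W, labels, n_classes, n_iter=20, dt=0.1, mu=10.0):
    """Hypothetical multiclass graph MBO sketch.

    W      : symmetric (n, n) weight matrix of the graph
    labels : int array of length n, with -1 marking unlabeled nodes
    """
    n = W.shape[0]
    # Unnormalized graph Laplacian (an assumption; other normalizations exist).
    L = np.diag(W.sum(axis=1)) - W
    evals, evecs = np.linalg.eigh(L)

    known = labels >= 0
    U_hat = np.zeros((n, n_classes))
    U_hat[known, labels[known]] = 1.0

    # Start unlabeled nodes at the centre of the Gibbs simplex.
    U = np.full((n, n_classes), 1.0 / n_classes)
    U[known] = U_hat[known]

    for _ in range(n_iter):
        # Diffusion: implicit step on the Laplacian in the spectral domain,
        # with an explicit fidelity force on labeled nodes (assumed setup).
        rhs = U - dt * mu * known[:, None] * (U - U_hat)
        coeffs = evecs.T @ rhs
        coeffs /= (1.0 + dt * evals)[:, None]
        U = evecs @ coeffs

        # Thresholding: snap each row to the nearest simplex vertex.
        winners = U.argmax(axis=1)
        U = np.zeros_like(U)
        U[np.arange(n), winners] = 1.0

    return U.argmax(axis=1)

# Toy usage: two well-separated 1D clusters, one labeled point in each.
x = np.array([0.0, 0.1, 0.2, 5.0, 5.1, 5.2])
W = np.exp(-(x[:, None] - x[None, :]) ** 2)
np.fill_diagonal(W, 0.0)
labels = np.array([0, -1, -1, 1, -1, -1])
pred = graph_mbo(W, labels, n_classes=2)
```

On large graphs, a truncated eigendecomposition of a sparse Laplacian (in the spirit of the fast solvers the abstract mentions) would replace the dense `np.linalg.eigh` call used here.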
Towards View-invariant and Accurate Loop Detection Based on Scene Graph
Loop detection plays a key role in visual Simultaneous Localization and
Mapping (SLAM) by correcting the accumulated pose drift. In indoor scenarios,
the richly distributed semantic landmarks are view-point invariant and hold
strong descriptive power in loop detection. Current semantic-aided loop
detection methods embed the topology between semantic instances to search for a loop.
However, current semantic-aided loop detection methods face challenges in
dealing with ambiguous semantic instances and drastic viewpoint differences,
which are not fully addressed in the literature. This paper introduces a novel
loop detection method based on an incrementally created scene graph, targeting
the visual SLAM at indoor scenes. It jointly considers the macro-view topology,
micro-view topology, and occupancy of semantic instances to find correct
correspondences. Experiments using a handheld RGB-D sequence show that our method
can accurately detect loops under drastically changed viewpoints. It maintains
high precision when observing objects with similar topology and appearance. Our
method is also shown to be robust in changed indoor scenes.
Comment: Accepted by ICRA202