14,356 research outputs found
SEGCloud: Semantic Segmentation of 3D Point Clouds
3D semantic scene labeling is fundamental to agents operating in the real
world. In particular, labeling raw 3D point sets from sensors provides
fine-grained semantics. Recent works leverage the capabilities of Neural
Networks (NNs), but are limited to coarse voxel predictions and do not
explicitly enforce global consistency. We present SEGCloud, an end-to-end
framework to obtain 3D point-level segmentation that combines the advantages of
NNs, trilinear interpolation(TI) and fully connected Conditional Random Fields
(FC-CRF). Coarse voxel predictions from a 3D Fully Convolutional NN are
transferred back to the raw 3D points via trilinear interpolation. Then the
FC-CRF enforces global consistency and provides fine-grained semantics on the
points. We implement the latter as a differentiable Recurrent NN to allow joint
optimization. We evaluate the framework on two indoor and two outdoor 3D
datasets (NYU V2, S3DIS, KITTI, Semantic3D.net), and show performance
comparable or superior to the state-of-the-art on all datasets.Comment: Accepted as a spotlight at the International Conference of 3D Vision
(3DV 2017
Learning Deep Similarity Metric for 3D MR-TRUS Registration
Purpose: The fusion of transrectal ultrasound (TRUS) and magnetic resonance
(MR) images for guiding targeted prostate biopsy has significantly improved the
biopsy yield of aggressive cancers. A key component of MR-TRUS fusion is image
registration. However, it is very challenging to obtain a robust automatic
MR-TRUS registration due to the large appearance difference between the two
imaging modalities. The work presented in this paper aims to tackle this
problem by addressing two challenges: (i) the definition of a suitable
similarity metric and (ii) the determination of a suitable optimization
strategy.
Methods: This work proposes the use of a deep convolutional neural network to
learn a similarity metric for MR-TRUS registration. We also use a composite
optimization strategy that explores the solution space in order to search for a
suitable initialization for the second-order optimization of the learned
metric. Further, a multi-pass approach is used in order to smooth the metric
for optimization.
Results: The learned similarity metric outperforms the classical mutual
information and also the state-of-the-art MIND feature based methods. The
results indicate that the overall registration framework has a large capture
range. The proposed deep similarity metric based approach obtained a mean TRE
of 3.86mm (with an initial TRE of 16mm) for this challenging problem.
Conclusion: A similarity metric that is learned using a deep neural network
can be used to assess the quality of any given image registration and can be
used in conjunction with the aforementioned optimization framework to perform
automatic registration that is robust to poor initialization.Comment: To appear on IJCAR
Deep Sketch Hashing: Fast Free-hand Sketch-Based Image Retrieval
Free-hand sketch-based image retrieval (SBIR) is a specific cross-view
retrieval task, in which queries are abstract and ambiguous sketches while the
retrieval database is formed with natural images. Work in this area mainly
focuses on extracting representative and shared features for sketches and
natural images. However, these can neither cope well with the geometric
distortion between sketches and images nor be feasible for large-scale SBIR due
to the heavy continuous-valued distance computation. In this paper, we speed up
SBIR by introducing a novel binary coding method, named \textbf{Deep Sketch
Hashing} (DSH), where a semi-heterogeneous deep architecture is proposed and
incorporated into an end-to-end binary coding framework. Specifically, three
convolutional neural networks are utilized to encode free-hand sketches,
natural images and, especially, the auxiliary sketch-tokens which are adopted
as bridges to mitigate the sketch-image geometric distortion. The learned DSH
codes can effectively capture the cross-view similarities as well as the
intrinsic semantic correlations between different categories. To the best of
our knowledge, DSH is the first hashing work specifically designed for
category-level SBIR with an end-to-end deep architecture. The proposed DSH is
comprehensively evaluated on two large-scale datasets of TU-Berlin Extension
and Sketchy, and the experiments consistently show DSH's superior SBIR
accuracies over several state-of-the-art methods, while achieving significantly
reduced retrieval time and memory footprint.Comment: This paper will appear as a spotlight paper in CVPR201
- …