Saliency-guided integration of multiple scans
We present a novel method...
Mining Point Cloud Local Structures by Kernel Correlation and Graph Pooling
Unlike on images, semantic learning on 3D point clouds using a deep network
is challenging due to the naturally unordered data structure. Among existing
works, PointNet has achieved promising results by directly learning on point
sets. However, it does not take full advantage of a point's local neighborhood
that contains fine-grained structural information which turns out to be helpful
towards better semantic learning. In this regard, we present two new operations
to improve PointNet with a more efficient exploitation of local structures. The
first one focuses on local 3D geometric structures. In analogy to a convolution
kernel for images, we define a point-set kernel as a set of learnable 3D points
that jointly respond to a set of neighboring data points according to their
geometric affinities measured by kernel correlation, adapted from a similar
technique for point cloud registration. The second one exploits local
high-dimensional feature structures by recursive feature aggregation on a
nearest-neighbor-graph computed from 3D positions. Experiments show that our
network can efficiently capture local information and robustly achieve better
performances on major datasets. Our code is available at
http://www.merl.com/research/license#KCNet
Comment: Accepted in CVPR'18. * indicates equal contribution
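The point-set kernel idea in the abstract above can be illustrated with a minimal sketch. It assumes a Gaussian affinity between each kernel point and each neighbor offset, averaged over the neighbors; the function name `kernel_correlation`, the `sigma` parameter, and the normalization are hypothetical choices made for illustration and are not confirmed by the abstract.

```python
import math


def kernel_correlation(kernel_points, anchor, neighbors, sigma=1.0):
    """Mean Gaussian affinity between kernel points and neighbor offsets.

    kernel_points: list of 3D tuples (these would be learnable in a network).
    anchor:        the 3D point whose neighborhood is being described.
    neighbors:     list of 3D points near the anchor.
    sigma:         assumed kernel width controlling how fast affinity decays.
    """
    if not neighbors:
        return 0.0
    total = 0.0
    for nb in neighbors:
        # Express the neighbor relative to the anchor so the response
        # does not depend on where the neighborhood sits in space.
        offset = tuple(n - a for n, a in zip(nb, anchor))
        for k in kernel_points:
            sq_dist = sum((ki - oi) ** 2 for ki, oi in zip(k, offset))
            total += math.exp(-sq_dist / (2.0 * sigma ** 2))
    return total / len(neighbors)


if __name__ == "__main__":
    kernel = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
    anchor = (0.0, 0.0, 0.0)
    neighbors = [(1.0, 0.1, 0.0), (0.1, 0.9, 0.0), (0.0, 0.0, 1.0)]
    print(kernel_correlation(kernel, anchor, neighbors, sigma=0.5))
```

The response is largest when the neighbor offsets coincide with the kernel points and decays smoothly as they move apart, which is what allows such a kernel to be trained by gradient descent.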
Bags of Affine Subspaces for Robust Object Tracking
We propose an adaptive tracking algorithm where the object is modelled as a
continuously updated bag of affine subspaces, with each subspace constructed
from the object's appearance over several consecutive frames. In contrast to
linear subspaces, affine subspaces explicitly model the origin of subspaces.
Furthermore, instead of using a brittle point-to-subspace distance during the
search for the object in a new frame, we propose to use a subspace-to-subspace
distance by representing candidate image areas also as affine subspaces.
Distances between subspaces are then obtained by exploiting the non-Euclidean
geometry of Grassmann manifolds. Experiments on challenging videos (containing
object occlusions, deformations, as well as variations in pose and
illumination) indicate that the proposed method achieves higher tracking
accuracy than several recent discriminative trackers.
Comment: in International Conference on Digital Image Computing: Techniques
and Applications, 201
A Comprehensive Survey of Deep Learning in Remote Sensing: Theories, Tools and Challenges for the Community
In recent years, deep learning (DL), a re-branding of neural networks (NNs),
has risen to the top in numerous areas, namely computer vision (CV), speech
recognition, natural language processing, etc. Whereas remote sensing (RS)
possesses a number of unique challenges, primarily related to sensors and
applications, inevitably RS draws from many of the same theories as CV; e.g.,
statistics, fusion, and machine learning, to name a few. This means that the RS
community should be aware of, if not at the leading edge of, advancements
like DL. Herein, we provide the most comprehensive survey of state-of-the-art
RS DL research. We also review recent new developments in the DL field that can
be used in DL for RS. Namely, we focus on theories, tools and challenges for
the RS community. Specifically, we focus on unsolved challenges and
opportunities as they relate to (i) inadequate data sets, (ii)
human-understandable solutions for modelling physical phenomena, (iii) Big
Data, (iv) non-traditional heterogeneous data sources, (v) DL architectures and
learning algorithms for spectral, spatial and temporal data, (vi) transfer
learning, (vii) an improved theoretical understanding of DL systems, (viii)
high barriers to entry, and (ix) training and optimizing the DL.
Comment: 64 pages, 411 references. To appear in Journal of Applied Remote
Sensing
On the Design and Analysis of Multiple View Descriptors
We propose an extension of popular descriptors based on gradient orientation
histograms (HOG, computed in a single image) to multiple views. It hinges on
interpreting HOG as a conditional density in the space of sampled images, where
the effects of nuisance factors such as viewpoint and illumination are
marginalized. However, such marginalization is performed with respect to a very
coarse approximation of the underlying distribution. Our extension leverages
the fact that multiple views of the same scene allow separating intrinsic from
nuisance variability, and thus afford better marginalization of the latter. The
result is a descriptor that has the same complexity as single-view HOG, and can
be compared in the same manner, but exploits multiple views to better trade off
insensitivity to nuisance variability with specificity to intrinsic
variability. We also introduce a novel multi-view wide-baseline matching
dataset, consisting of a mixture of real and synthetic objects with
ground-truthed camera motion and dense three-dimensional geometry.
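As background for the abstract above, here is a minimal sketch of the single-view building block it starts from: a magnitude-weighted histogram of gradient orientations over one image patch. The function name `orientation_histogram`, the 9-bin unsigned layout, and the central-difference gradients are assumptions chosen for illustration; the multi-view extension itself is not reproduced here.

```python
import math


def orientation_histogram(image, bins=9):
    """Normalized histogram of unsigned gradient orientations.

    image: list of rows of grayscale values (at least 3x3 to have an interior).
    bins:  number of orientation bins covering [0, 180) degrees (assumed).
    """
    hist = [0.0] * bins
    rows, cols = len(image), len(image[0])
    bin_width = 180.0 / bins
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            # Central differences approximate the image gradient.
            gx = image[r][c + 1] - image[r][c - 1]
            gy = image[r + 1][c] - image[r - 1][c]
            magnitude = math.hypot(gx, gy)
            if magnitude == 0.0:
                continue
            # Fold direction into [0, 180): opposite gradients share a bin.
            angle = math.degrees(math.atan2(gy, gx)) % 180.0
            index = min(int(angle / bin_width), bins - 1)
            hist[index] += magnitude
    total = sum(hist)
    return [h / total for h in hist] if total > 0 else hist


if __name__ == "__main__":
    # Intensity increases left to right, so gradients point along the x axis.
    ramp = [[float(c) for c in range(5)] for _ in range(5)]
    print(orientation_histogram(ramp))
```

Because the histogram is normalized to sum to one, it can be read as an empirical density over orientations, which is the interpretation the abstract builds on.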
Coupled non-parametric shape and moment-based inter-shape pose priors for multiple basal ganglia structure segmentation
This paper presents a new active contour-based, statistical method for simultaneous volumetric segmentation of multiple subcortical structures in the brain. In biological tissues, such as the human brain, neighboring structures exhibit co-dependencies which can aid in segmentation, if properly analyzed and modeled. Motivated by this observation, we formulate the segmentation problem as a maximum a posteriori estimation problem, in which we incorporate statistical prior models on the shapes and inter-shape (relative) poses of the structures of interest. This provides a principled mechanism to bring high-level information about the shapes and the relationships of anatomical structures into the segmentation problem. For learning the prior densities, we use a nonparametric multivariate kernel density estimation framework. We combine these priors with data in a variational framework and develop an active contour-based iterative segmentation algorithm.
We test our method on the problem of volumetric segmentation of basal ganglia structures in magnetic resonance (MR) images.
We present a set of 2D and 3D experiments as well as a quantitative performance analysis. In addition, we perform a comparison to several existing segmentation methods and demonstrate the improvements provided by our approach in terms of segmentation accuracy.
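The nonparametric prior mentioned in the abstract above can be illustrated with a minimal sketch of multivariate kernel density estimation. It assumes an isotropic Gaussian kernel with a single bandwidth and plain feature vectors as training samples; the function name `gaussian_kde`, the `bandwidth` parameter, and the toy two-dimensional samples are hypothetical and do not reflect the paper's actual shape or pose representation.

```python
import math


def gaussian_kde(samples, query, bandwidth=1.0):
    """Density estimate at `query` from training `samples`.

    samples:   non-empty list of equal-length tuples (hypothetical features).
    query:     tuple of the same length at which to evaluate the density.
    bandwidth: assumed isotropic Gaussian kernel width.
    """
    dim = len(query)
    # Normalizing constant of an isotropic Gaussian in `dim` dimensions.
    norm = (2.0 * math.pi * bandwidth ** 2) ** (dim / 2.0)
    total = 0.0
    for sample in samples:
        sq_dist = sum((q - s) ** 2 for q, s in zip(query, sample))
        total += math.exp(-sq_dist / (2.0 * bandwidth ** 2))
    return total / (len(samples) * norm)


if __name__ == "__main__":
    training = [(0.0, 0.0), (0.2, -0.1), (1.0, 1.1)]
    print(gaussian_kde(training, (0.1, 0.0), bandwidth=0.5))
    print(gaussian_kde(training, (3.0, 3.0), bandwidth=0.5))
```

A candidate close to the training samples receives a higher density than one far from all of them, which is the sense in which such an estimate can act as a learned prior inside a maximum a posteriori formulation.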