9,560 research outputs found

    Saliency-guided integration of multiple scans

    Get PDF
    we present a novel method..

    Mining Point Cloud Local Structures by Kernel Correlation and Graph Pooling

    Full text link
    Unlike on images, semantic learning on 3D point clouds using a deep network is challenging due to the naturally unordered data structure. Among existing works, PointNet has achieved promising results by directly learning on point sets. However, it does not take full advantage of a point's local neighborhood that contains fine-grained structural information which turns out to be helpful towards better semantic learning. In this regard, we present two new operations to improve PointNet with a more efficient exploitation of local structures. The first one focuses on local 3D geometric structures. In analogy to a convolution kernel for images, we define a point-set kernel as a set of learnable 3D points that jointly respond to a set of neighboring data points according to their geometric affinities measured by kernel correlation, adapted from a similar technique for point cloud registration. The second one exploits local high-dimensional feature structures by recursive feature aggregation on a nearest-neighbor-graph computed from 3D positions. Experiments show that our network can efficiently capture local information and robustly achieve better performances on major datasets. Our code is available at http://www.merl.com/research/license#KCNetComment: Accepted in CVPR'18. *indicates equal contributio

    Bags of Affine Subspaces for Robust Object Tracking

    Full text link
    We propose an adaptive tracking algorithm where the object is modelled as a continuously updated bag of affine subspaces, with each subspace constructed from the object's appearance over several consecutive frames. In contrast to linear subspaces, affine subspaces explicitly model the origin of subspaces. Furthermore, instead of using a brittle point-to-subspace distance during the search for the object in a new frame, we propose to use a subspace-to-subspace distance by representing candidate image areas also as affine subspaces. Distances between subspaces are then obtained by exploiting the non-Euclidean geometry of Grassmann manifolds. Experiments on challenging videos (containing object occlusions, deformations, as well as variations in pose and illumination) indicate that the proposed method achieves higher tracking accuracy than several recent discriminative trackers.Comment: in International Conference on Digital Image Computing: Techniques and Applications, 201

    A Comprehensive Survey of Deep Learning in Remote Sensing: Theories, Tools and Challenges for the Community

    Full text link
    In recent years, deep learning (DL), a re-branding of neural networks (NNs), has risen to the top in numerous areas, namely computer vision (CV), speech recognition, natural language processing, etc. Whereas remote sensing (RS) possesses a number of unique challenges, primarily related to sensors and applications, inevitably RS draws from many of the same theories as CV; e.g., statistics, fusion, and machine learning, to name a few. This means that the RS community should be aware of, if not at the leading edge of, of advancements like DL. Herein, we provide the most comprehensive survey of state-of-the-art RS DL research. We also review recent new developments in the DL field that can be used in DL for RS. Namely, we focus on theories, tools and challenges for the RS community. Specifically, we focus on unsolved challenges and opportunities as it relates to (i) inadequate data sets, (ii) human-understandable solutions for modelling physical phenomena, (iii) Big Data, (iv) non-traditional heterogeneous data sources, (v) DL architectures and learning algorithms for spectral, spatial and temporal data, (vi) transfer learning, (vii) an improved theoretical understanding of DL systems, (viii) high barriers to entry, and (ix) training and optimizing the DL.Comment: 64 pages, 411 references. To appear in Journal of Applied Remote Sensin

    On the Design and Analysis of Multiple View Descriptors

    Full text link
    We propose an extension of popular descriptors based on gradient orientation histograms (HOG, computed in a single image) to multiple views. It hinges on interpreting HOG as a conditional density in the space of sampled images, where the effects of nuisance factors such as viewpoint and illumination are marginalized. However, such marginalization is performed with respect to a very coarse approximation of the underlying distribution. Our extension leverages on the fact that multiple views of the same scene allow separating intrinsic from nuisance variability, and thus afford better marginalization of the latter. The result is a descriptor that has the same complexity of single-view HOG, and can be compared in the same manner, but exploits multiple views to better trade off insensitivity to nuisance variability with specificity to intrinsic variability. We also introduce a novel multi-view wide-baseline matching dataset, consisting of a mixture of real and synthetic objects with ground truthed camera motion and dense three-dimensional geometry

    Coupled non-parametric shape and moment-based inter-shape pose priors for multiple basal ganglia structure segmentation

    Get PDF
    This paper presents a new active contour-based, statistical method for simultaneous volumetric segmentation of multiple subcortical structures in the brain. In biological tissues, such as the human brain, neighboring structures exhibit co-dependencies which can aid in segmentation, if properly analyzed and modeled. Motivated by this observation, we formulate the segmentation problem as a maximum a posteriori estimation problem, in which we incorporate statistical prior models on the shapes and inter-shape (relative) poses of the structures of interest. This provides a principled mechanism to bring high level information about the shapes and the relationships of anatomical structures into the segmentation problem. For learning the prior densities we use a nonparametric multivariate kernel density estimation framework. We combine these priors with data in a variational framework and develop an active contour-based iterative segmentation algorithm. We test our method on the problem of volumetric segmentation of basal ganglia structures in magnetic resonance (MR) images. We present a set of 2D and 3D experiments as well as a quantitative performance analysis. In addition, we perform a comparison to several existent segmentation methods and demonstrate the improvements provided by our approach in terms of segmentation accuracy
    • 

    corecore