129,739 research outputs found

    A Comprehensive Survey of Deep Learning in Remote Sensing: Theories, Tools and Challenges for the Community

    Full text link
    In recent years, deep learning (DL), a re-branding of neural networks (NNs), has risen to the top in numerous areas, namely computer vision (CV), speech recognition, natural language processing, etc. Whereas remote sensing (RS) possesses a number of unique challenges, primarily related to sensors and applications, inevitably RS draws from many of the same theories as CV; e.g., statistics, fusion, and machine learning, to name a few. This means that the RS community should be aware of, if not at the leading edge of, of advancements like DL. Herein, we provide the most comprehensive survey of state-of-the-art RS DL research. We also review recent new developments in the DL field that can be used in DL for RS. Namely, we focus on theories, tools and challenges for the RS community. Specifically, we focus on unsolved challenges and opportunities as it relates to (i) inadequate data sets, (ii) human-understandable solutions for modelling physical phenomena, (iii) Big Data, (iv) non-traditional heterogeneous data sources, (v) DL architectures and learning algorithms for spectral, spatial and temporal data, (vi) transfer learning, (vii) an improved theoretical understanding of DL systems, (viii) high barriers to entry, and (ix) training and optimizing the DL.Comment: 64 pages, 411 references. To appear in Journal of Applied Remote Sensin

    Automatic learning of gait signatures for people identification

    Get PDF
    This work targets people identification in video based on the way they walk (i.e. gait). While classical methods typically derive gait signatures from sequences of binary silhouettes, in this work we explore the use of convolutional neural networks (CNN) for learning high-level descriptors from low-level motion features (i.e. optical flow components). We carry out a thorough experimental evaluation of the proposed CNN architecture on the challenging TUM-GAID dataset. The experimental results indicate that using spatio-temporal cuboids of optical flow as input data for CNN allows to obtain state-of-the-art results on the gait task with an image resolution eight times lower than the previously reported results (i.e. 80x60 pixels).Comment: Proof of concept paper. Technical report on the use of ConvNets (CNN) for gait recognition. Data and code: http://www.uco.es/~in1majim/research/cnngaitof.htm

    A Discriminative Representation of Convolutional Features for Indoor Scene Recognition

    Full text link
    Indoor scene recognition is a multi-faceted and challenging problem due to the diverse intra-class variations and the confusing inter-class similarities. This paper presents a novel approach which exploits rich mid-level convolutional features to categorize indoor scenes. Traditionally used convolutional features preserve the global spatial structure, which is a desirable property for general object recognition. However, we argue that this structuredness is not much helpful when we have large variations in scene layouts, e.g., in indoor scenes. We propose to transform the structured convolutional activations to another highly discriminative feature space. The representation in the transformed space not only incorporates the discriminative aspects of the target dataset, but it also encodes the features in terms of the general object categories that are present in indoor scenes. To this end, we introduce a new large-scale dataset of 1300 object categories which are commonly present in indoor scenes. Our proposed approach achieves a significant performance boost over previous state of the art approaches on five major scene classification datasets

    Face Identification and Clustering

    Full text link
    In this thesis, we study two problems based on clustering algorithms. In the first problem, we study the role of visual attributes using an agglomerative clustering algorithm to whittle down the search area where the number of classes is high to improve the performance of clustering. We observe that as we add more attributes, the clustering performance increases overall. In the second problem, we study the role of clustering in aggregating templates in a 1:N open set protocol using multi-shot video as a probe. We observe that by increasing the number of clusters, the performance increases with respect to the baseline and reaches a peak, after which increasing the number of clusters causes the performance to degrade. Experiments are conducted using recently introduced unconstrained IARPA Janus IJB-A, CS2, and CS3 face recognition datasets
    corecore