
    Joint & Progressive Learning from High-Dimensional Data for Multi-Label Classification

    Although nonlinear subspace learning techniques (e.g., manifold learning) have been successfully applied to data representation, there is still room for improvement in explainability (explicit mapping), generalization (out-of-sample extension), and cost-effectiveness (linearization). To this end, a novel linearized subspace learning technique, called the joint and progressive learning strategy (J-Play), is developed and applied to multi-label classification. J-Play learns a high-level, semantically meaningful feature representation from high-dimensional data by 1) jointly performing multiple subspace learning and classification to find a latent subspace where samples are expected to be better classified; 2) progressively learning multi-coupled projections to linearly approximate the optimal mapping that bridges the original space and the most discriminative subspace; 3) locally embedding the manifold structure in each learnable latent subspace. Extensive experiments demonstrate the superiority and effectiveness of the proposed method in comparison with previous state-of-the-art methods. (Comment: accepted at ECCV 2018.)
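
    As a concrete reading of this recipe, here is a minimal, hypothetical sketch: a cascade of linear projections (plain PCA stages stand in for the learned multi-coupled projections) followed by a ridge classifier on the final subspace. The real J-Play optimizes all projections, the classifier, and a manifold regularizer jointly; every name and dimension below is illustrative.

```python
# Toy stand-in for J-Play's progressive projections (hypothetical API).
import numpy as np

def progressive_projections(X, layer_dims):
    """X: (n, d) samples; layer_dims: decreasing latent dimensions."""
    Z, projections = X, []
    for k in layer_dims:
        Zc = Z - Z.mean(axis=0)
        # PCA basis as a stand-in for one learned coupled projection;
        # J-Play learns these jointly with the classifier instead.
        _, _, Vt = np.linalg.svd(Zc, full_matrices=False)
        P = Vt[:k]                      # (k, current_dim)
        projections.append(P)
        Z = Zc @ P.T                    # move to the next latent subspace
    return Z, projections

def ridge_classifier(Z, Y, lam=1e-2):
    """Least-squares classifier from the final subspace to one-hot labels."""
    A = Z.T @ Z + lam * np.eye(Z.shape[1])
    return np.linalg.solve(A, Z.T @ Y)  # (k, c) weight matrix

# Usage sketch: Z, _ = progressive_projections(X, [128, 32]); W = ridge_classifier(Z, Y)
```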

    Spatial-Spectral Manifold Embedding of Hyperspectral Data

    In recent years, hyperspectral imaging, also known as imaging spectroscopy, has attracted increasing interest in the geoscience and remote sensing community. Hyperspectral imagery is characterized by very rich spectral information, which enables the materials of interest lying on the Earth's surface to be recognized more easily. However, the high spectral dimension inevitably brings drawbacks, such as expensive data storage and transmission and information redundancy. Therefore, to reduce the spectral dimensionality effectively and learn a more discriminative low-dimensional spectral embedding, this paper proposes a novel hyperspectral embedding approach that simultaneously considers spatial and spectral information, called spatial-spectral manifold embedding (SSME). Going beyond pixel-wise spectral embedding approaches, SSME models the spatial and spectral information jointly in a patch-based fashion. SSME not only learns the spectral embedding by using an adjacency matrix obtained from similarity measurements between spectral signatures, but also models the spatial neighbours of a target pixel in the hyperspectral scene by sharing the same weights (or edges) during embedding learning. Classification is explored as a potential application for quantitatively evaluating the performance of the learned embedding representations. Extensive experiments conducted on widely used hyperspectral datasets demonstrate the superiority and effectiveness of the proposed SSME compared to several state-of-the-art embedding methods.
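
    As a rough illustration of the graph construction described above, the sketch below builds an adjacency matrix from spectral k-nearest neighbours, adds edges for 4-connected spatial neighbours, and embeds the graph with Laplacian eigenmaps. All parameter names are assumptions, the brute-force neighbour search only suits small cubes, and the paper's patch-based weighting is simplified away.

```python
# Hypothetical SSME-style graph embedding; only suitable for small cubes.
import numpy as np
from scipy.sparse import lil_matrix
from scipy.sparse.csgraph import laplacian
from scipy.sparse.linalg import eigsh

def ssme_sketch(cube, k_spec=10, dim=20):
    """cube: (h, w, bands) hyperspectral array; returns (h*w, dim) embedding."""
    h, w, b = cube.shape
    X = cube.reshape(-1, b).astype(float)
    n = X.shape[0]
    A = lil_matrix((n, n))
    # Spectral edges: k nearest neighbours by Euclidean distance between
    # spectral signatures (brute force stand-in for the paper's similarity).
    for i in range(n):
        d = np.linalg.norm(X - X[i], axis=1)
        for j in np.argsort(d)[1:k_spec + 1]:
            A[i, j] = A[j, i] = 1.0
    # Spatial edges: 4-connected grid neighbours share weights (edges).
    for r in range(h):
        for c in range(w):
            i = r * w + c
            for dr, dc in ((0, 1), (1, 0)):
                if r + dr < h and c + dc < w:
                    j = (r + dr) * w + (c + dc)
                    A[i, j] = A[j, i] = 1.0
    L = laplacian(A.tocsr(), normed=True)
    # Smallest nontrivial eigenvectors give the low-dimensional embedding.
    _, vecs = eigsh(L, k=dim + 1, which="SM")
    return vecs[:, 1:]
```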

    X-ModalNet: A Semi-Supervised Deep Cross-Modal Network for Classification of Remote Sensing Data

    This paper addresses the problem of semi-supervised transfer learning with limited cross-modality data in remote sensing. A large amount of multi-modal Earth observation imagery, such as multispectral imagery (MSI) or synthetic aperture radar (SAR) data, is openly available on a global scale, enabling the parsing of global urban scenes through remote sensing imagery. However, the ability of such data to identify materials (pixel-wise classification) remains limited, due to noisy collection environments, poor discriminative information, and the limited number of well-annotated training images. To this end, we propose a novel cross-modal deep-learning framework, called X-ModalNet, with three well-designed modules: a self-adversarial module, an interactive learning module, and a label propagation module. It learns to transfer discriminative information from a small-scale hyperspectral image (HSI) into a classification task that uses large-scale MSI or SAR data. Significantly, X-ModalNet generalizes well, owing to propagating labels on an updatable graph constructed from high-level features at the top of the network, yielding semi-supervised cross-modality learning. We evaluate X-ModalNet on two multi-modal remote sensing datasets (HSI-MSI and HSI-SAR) and achieve a significant improvement over several state-of-the-art methods.
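
    Of the three modules, the label propagation idea is the easiest to show in isolation. The sketch below is a generic Zhou-style label propagation routine over a kNN graph built on high-level features; it is not the paper's updatable-graph module, and the self-adversarial and interactive learning parts are omitted entirely.

```python
# Generic graph label propagation (Zhou et al. style), not the paper's module.
import numpy as np

def propagate_labels(F, Y, labeled_mask, k=10, alpha=0.99, iters=50):
    """F: (n, d) features; Y: (n, c) one-hot rows, zeros where unlabeled."""
    n = F.shape[0]
    W = np.zeros((n, n))
    for i in range(n):                    # kNN affinity graph on features
        d = np.linalg.norm(F - F[i], axis=1)
        for j in np.argsort(d)[1:k + 1]:
            W[i, j] = W[j, i] = np.exp(-d[j] ** 2)
    Dinv = np.diag(1.0 / np.sqrt(W.sum(axis=1) + 1e-12))
    S = Dinv @ W @ Dinv                   # symmetrically normalized affinity
    Z = Y.astype(float).copy()
    for _ in range(iters):
        Z = alpha * S @ Z + (1 - alpha) * Y   # diffuse, then pull to seeds
        Z[labeled_mask] = Y[labeled_mask]     # clamp known labels
    return Z.argmax(axis=1)
```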

    Spectral Superresolution of Multispectral Imagery with Joint Sparse and Low-Rank Learning

    Extensive attention has been paid to enhancing the spatial resolution of hyperspectral (HS) images with the aid of multispectral (MS) images in remote sensing. However, the fusion of HS and MS images remains limited, particularly in large-scale scenes, due to the limited acquisition of HS images. Alternatively, we super-resolve MS images in the spectral domain by means of partially overlapping HS images, yielding a novel and promising topic: spectral superresolution (SSR) of MS imagery. This is a challenging and less investigated task due to its high ill-posedness as an inverse imaging problem. To this end, we develop a simple but effective method, called joint sparse and low-rank learning (J-SLoL), to spectrally enhance MS images by jointly learning low-rank HS-MS dictionary pairs from the overlapping regions. J-SLoL infers and recovers the unknown hyperspectral signals over a larger coverage by sparse coding on the learned dictionary pair. We validate the SSR performance on three HS-MS datasets (two for classification and one for unmixing) in terms of reconstruction, classification, and unmixing, comparing against several existing state-of-the-art baselines and showing the effectiveness and superiority of the proposed J-SLoL algorithm. Furthermore, the codes and datasets will be available at https://github.com/danfenghong/IEEE_TGRS_J-SLoL, contributing to the RS community.
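
    A minimal sketch of the coupled-dictionary idea, assuming paired MS/HS pixels from the overlap region: stack the two modalities so both dictionaries share one sparse code, then sparse-code new MS pixels and reconstruct their HS spectra. The low-rank constraint that gives J-SLoL its name is omitted, and the sklearn-based solver is a stand-in for the paper's optimization.

```python
# Coupled-dictionary sketch with sklearn stand-ins for the paper's solver.
import numpy as np
from sklearn.decomposition import DictionaryLearning
from sklearn.linear_model import orthogonal_mp

def train_coupled_dicts(ms_overlap, hs_overlap, n_atoms=128):
    """ms_overlap: (n, b_ms), hs_overlap: (n, b_hs); rows are paired pixels."""
    # Stacking modalities forces both dictionaries to share one sparse code.
    joint = np.hstack([ms_overlap, hs_overlap])
    dl = DictionaryLearning(n_components=n_atoms, alpha=1.0, max_iter=200)
    dl.fit(joint)
    D = dl.components_                     # (n_atoms, b_ms + b_hs)
    b_ms = ms_overlap.shape[1]
    return D[:, :b_ms], D[:, b_ms:]        # D_ms, D_hs

def spectral_superresolve(ms_pixels, D_ms, D_hs, n_nonzero=8):
    """Sparse-code MS pixels on D_ms, reconstruct HS spectra with D_hs."""
    codes = orthogonal_mp(D_ms.T, ms_pixels.T, n_nonzero_coefs=n_nonzero)
    return (D_hs.T @ codes).T              # (n, b_hs) estimated HS spectra
```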

    More Diverse Means Better: Multimodal Deep Learning Meets Remote Sensing Imagery Classification

    Classification and identification of the materials lying over or beneath the Earth's surface have long been fundamental but challenging research topics in geoscience and remote sensing (RS), and have garnered growing attention owing to recent advances in deep learning. Although deep networks have been successfully applied to single-modality-dominated classification tasks, their performance inevitably hits a bottleneck in complex scenes that need to be finely classified, due to the limited diversity of information. In this work, we provide a baseline solution to this difficulty by developing a general multimodal deep learning (MDL) framework. In particular, we also investigate a special case of multi-modality learning (MML), cross-modality learning (CML), which arises widely in RS image classification applications. By focusing on "what", "where", and "how" to fuse, we show different fusion strategies as well as how to train deep networks and build the network architecture. Specifically, five fusion architectures are introduced and developed, and further unified in our MDL framework. More significantly, our framework is not only limited to pixel-wise classification tasks but is also applicable to spatial information modeling with convolutional neural networks (CNNs). To validate the effectiveness and superiority of the MDL framework, extensive experiments covering both MML and CML settings are conducted on two different multimodal RS datasets. Furthermore, the codes and datasets will be available at https://github.com/danfenghong/IEEE_TGRS_MDL-RS, contributing to the RS community.
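
    As a toy example of the "where to fuse" design choice, the PyTorch sketch below implements middle fusion: each modality gets its own encoder and the feature maps are concatenated before a shared head (early fusion would instead concatenate the raw inputs). Layer sizes are illustrative and this is not one of the paper's five architectures.

```python
# Toy middle-fusion network; sizes illustrative, not the paper's design.
import torch
import torch.nn as nn

class MiddleFusionNet(nn.Module):
    def __init__(self, bands_a, bands_b, n_classes, width=64):
        super().__init__()
        # "What" to fuse: one modality-specific encoder per input stream.
        self.enc_a = nn.Sequential(nn.Conv2d(bands_a, width, 3, padding=1), nn.ReLU())
        self.enc_b = nn.Sequential(nn.Conv2d(bands_b, width, 3, padding=1), nn.ReLU())
        # "How" to fuse: concatenation followed by a shared classification head.
        self.head = nn.Sequential(
            nn.Conv2d(2 * width, width, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(width, n_classes))

    def forward(self, xa, xb):
        # "Where" to fuse: here, after one block each (middle fusion).
        return self.head(torch.cat([self.enc_a(xa), self.enc_b(xb)], dim=1))

# Usage sketch: net = MiddleFusionNet(bands_a=180, bands_b=4, n_classes=10)
```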

    Coupled Convolutional Neural Network with Adaptive Response Function Learning for Unsupervised Hyperspectral Super-Resolution

    Due to the limitations of hyperspectral imaging systems, hyperspectral imagery (HSI) often suffers from poor spatial resolution, hampering many of its applications. Hyperspectral super-resolution refers to fusing HSI and MSI to generate an image with both high spatial and high spectral resolution. Recently, several new methods have been proposed to solve this fusion problem, and most of them assume that prior information about the point spread function (PSF) and spectral response function (SRF) is known. In practice, however, this information is often limited or unavailable. In this work, we propose HyCoNet, an unsupervised deep learning-based fusion method that solves the HSI-MSI fusion problem without prior PSF and SRF information. HyCoNet consists of three coupled autoencoder nets in which the HSI and MSI are unmixed into endmembers and abundances based on the linear unmixing model. Two special convolutional layers are designed to act as a bridge coordinating the three autoencoder nets, and the PSF and SRF parameters are learned adaptively in these two convolutional layers during training. Driven by the joint loss function, the proposed method is straightforward and easily implemented in an end-to-end training manner. The experiments performed in the study demonstrate that the proposed method performs well and produces robust results for different datasets and arbitrary PSFs and SRFs.
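
    The sketch below illustrates one way the PSF and SRF can be made learnable, as the abstract describes: the SRF as a 1x1 convolution across spectral bands and the PSF as a shared depthwise blur kernel, both trained from the joint loss. This is an assumption-laden fragment; the three coupled autoencoders and the unmixing losses are not reproduced.

```python
# Learnable SRF/PSF fragment (assumed names); autoencoders omitted.
import torch
import torch.nn as nn
import torch.nn.functional as F

class LearnableDegradation(nn.Module):
    def __init__(self, hs_bands, ms_bands, psf_size=7, scale=4):
        super().__init__()
        # SRF as a 1x1 convolution mixing HS bands down to MS bands.
        self.srf = nn.Conv2d(hs_bands, ms_bands, kernel_size=1, bias=False)
        # PSF as a single learnable blur kernel shared across all bands.
        self.psf = nn.Parameter(torch.full((1, 1, psf_size, psf_size),
                                           1.0 / psf_size ** 2))
        self.scale = scale

    def spectral_degrade(self, hs):
        return self.srf(hs)                        # simulate an MS cube

    def spatial_degrade(self, x):
        k = self.psf.expand(x.shape[1], 1, -1, -1)
        x = F.conv2d(x, k, padding=self.psf.shape[-1] // 2, groups=x.shape[1])
        return x[..., ::self.scale, ::self.scale]  # blur then downsample
```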

    Tensor Singular Spectral Analysis for 3D Feature Extraction in Hyperspectral Images

    Due to the cubic structure of a hyperspectral image (HSI), characterizing its spectral and spatial properties in three dimensions is challenging. Conventional spectral-spatial methods usually extract spectral and spatial information separately, ignoring their intrinsic correlations. Recently, some 3D feature extraction methods have been developed to extract spectral and spatial features simultaneously, but they rely on local spatial-spectral regions and thus ignore global spectral similarity and spatial consistency. Meanwhile, some of these methods involve huge numbers of model parameters, which require a large number of training samples. In this paper, a novel tensor singular spectral analysis (TensorSSA) method is proposed to extract global low-rank features of an HSI. In TensorSSA, an adaptive embedding operation is first proposed to construct a trajectory tensor corresponding to the entire HSI, which takes full advantage of spatial similarity and improves the representation of the global low-rank properties of the HSI. Moreover, the obtained trajectory tensor, which contains both the global and local spatial and spectral information of the HSI, is decomposed via the tensor singular value decomposition (t-SVD) to explore its low-rank intrinsic features. Finally, the efficacy of the extracted features is evaluated using the accuracy of image classification with a support vector machine (SVM) classifier. Experimental results on three publicly available datasets demonstrate the superiority of the proposed TensorSSA over several state-of-the-art 2D/3D feature extraction and deep learning algorithms, even with a limited number of training samples.
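
    For reference, the t-SVD step has a compact standard implementation: FFT along the third mode, a truncated SVD of each frontal slice, and an inverse FFT. The sketch below shows only this low-rank decomposition; the adaptive embedding that builds the trajectory tensor from the HSI is not reproduced.

```python
# Standard truncated t-SVD; the trajectory-tensor construction is omitted.
import numpy as np

def tsvd_lowrank(T, rank):
    """Rank-truncated t-SVD approximation of T with shape (n1, n2, n3)."""
    Tf = np.fft.fft(T, axis=2)           # move to the Fourier domain
    out = np.zeros_like(Tf)
    for k in range(T.shape[2]):          # truncated SVD per frontal slice
        U, s, Vh = np.linalg.svd(Tf[:, :, k], full_matrices=False)
        out[:, :, k] = (U[:, :rank] * s[:rank]) @ Vh[:rank]
    return np.real(np.fft.ifft(out, axis=2))
```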

    Graph Relation Network: Modeling Relations between Scenes for Multi-Label Remote Sensing Image Classification and Retrieval

    Due to the proliferation of large-scale remote sensing (RS) archives with multiple annotations, multilabel RS scene classification and retrieval are becoming increasingly popular. Although some recent deep learning-based methods achieve promising results in this context, the lack of research on how to learn embedding spaces under the multilabel assumption often leaves these models unable to preserve the complex semantic relations pervading aerial scenes, which is an important limitation in RS applications. To fill this gap, we propose a new graph relation network (GRN) for multilabel RS scene categorization. Our GRN models the relations between samples (or scenes) by making use of a graph structure that is fed into network learning. For this purpose, we define a new loss function, called scalable neighbor discriminative loss with binary cross entropy (SNDL-BCE), that embeds the graph structures through the networks more effectively. The proposed approach can guide deep learning techniques (such as convolutional neural networks) toward a more discriminative metric space, where semantically similar RS scenes are closely embedded and dissimilar images are separated, from a novel multilabel viewpoint. To achieve this goal, our GRN jointly maximizes a weighted leave-one-out K-nearest neighbors (KNN) score in the training set, where the weight matrix describes the contribution of the nearest neighbors of each RS image to its class decision, and the likelihood of class discrimination in the multilabel scenario. An extensive experimental comparison, conducted on three multilabel RS scene data archives, validates the effectiveness of the proposed GRN in terms of KNN classification and image retrieval. The codes of this article will be made publicly available for reproducible research in the community.
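
    The following sketch conveys the flavor of such a loss, under stated assumptions: a soft leave-one-out neighbour score on L2-normalized embeddings, pushed toward each sample's multi-hot labels with binary cross entropy. The temperature and weighting are invented for illustration; this is not the paper's exact SNDL-BCE.

```python
# Illustrative neighbour-consistency loss; not the exact SNDL-BCE.
import torch
import torch.nn.functional as F

def neighbor_bce_sketch(emb, labels, temperature=0.1):
    """emb: (n, d) L2-normalized; labels: (n, c) multi-hot float tensor."""
    sim = emb @ emb.t() / temperature
    sim.fill_diagonal_(float("-inf"))     # leave-one-out: exclude self
    w = sim.softmax(dim=1)                # soft nearest-neighbour weights
    # Each sample's labels should be predictable from its neighbours' labels.
    pred = (w @ labels).clamp(1e-6, 1 - 1e-6)
    return F.binary_cross_entropy(pred, labels)
```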