
    Exemplar-supported representation for effective class-incremental learning

    Catastrophic forgetting is a key challenge for class-incremental learning with deep neural networks, where performance decreases considerably when dealing with long sequences of new classes. To tackle this issue, we propose a new exemplar-supported representation for incremental learning (ESRIL) approach that consists of three components. First, we use memory aware synapses (MAS), pre-trained on ImageNet, to retain the ability for robust representation learning and classification of old classes from the perspective of the model. Second, exemplar-based subspace clustering (ESC) is utilized to construct the exemplar set, which preserves performance across various views of the data. Third, a nearest class multiple centroids (NCMC) classifier is used in place of the fully connected layer of MAS to save training cost when a given criterion is met. Extensive experiments and analyses show the influence of various backbone structures and the effectiveness of the different components of our model. Experiments on several general-purpose and fine-grained image recognition datasets fully demonstrate the efficacy of the proposed methodology.
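    As a concrete illustration of the NCMC classifier mentioned above, the following is a minimal Python sketch, assuming deep features have already been extracted by the backbone; the use of k-means to obtain per-class centroids and the number of centroids per class are illustrative assumptions, not details taken from the paper.

    import numpy as np
    from sklearn.cluster import KMeans

    class NCMC:
        """Nearest class multiple centroids classifier (illustrative sketch)."""
        def __init__(self, n_centroids=3):
            self.n_centroids = n_centroids
            self.centroids = []  # list of (class_label, centroid) pairs

        def fit(self, features, labels):
            # Summarize each class by several centroids found with k-means
            # (an assumed choice; the paper may construct centroids differently).
            for c in np.unique(labels):
                km = KMeans(n_clusters=self.n_centroids, n_init=10)
                km.fit(features[labels == c])
                self.centroids += [(c, m) for m in km.cluster_centers_]

        def predict(self, features):
            labels = np.array([c for c, _ in self.centroids])
            centers = np.stack([m for _, m in self.centroids])
            # Assign each sample to the class of its nearest centroid.
            dists = np.linalg.norm(features[:, None, :] - centers[None, :, :], axis=2)
            return labels[dists.argmin(axis=1)]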

    Effective melanoma recognition using deep convolutional neural network with covariance discriminant loss

    Melanoma recognition is challenging due to data imbalance, high intra-class variation, and large inter-class similarity. To address these issues, we propose a melanoma recognition method for dermoscopy images that uses a deep convolutional neural network with a covariance discriminant loss. The network is trained under the joint supervision of a cross-entropy loss and the covariance discriminant loss, rectifying the model outputs and the extracted features simultaneously. Specifically, we design an embedding loss, the covariance discriminant loss, which takes the first- and second-order distances into account simultaneously to provide more constraints. By constraining the distance between hard samples and the minority-class center, the deep features of melanoma and non-melanoma can be separated effectively. We also design a corresponding algorithm to mine the hard samples, and we analyze the relationship between the proposed loss and other losses. On the International Symposium on Biomedical Imaging (ISBI) 2018 Skin Lesion Analysis dataset, the two schemes in the proposed method yield sensitivities of 0.942 and 0.917, respectively. These comprehensive results demonstrate the efficacy of the designed embedding loss and the proposed methodology.
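    To make the joint-supervision idea concrete, here is a rough Python (PyTorch) sketch of a cross-entropy loss combined with an embedding penalty on hard minority-class samples; the specific first-order (distance-to-center) and second-order (feature-covariance) terms below are assumed stand-ins, not the paper's exact formulation.

    import torch
    import torch.nn.functional as F

    def covariance_discriminant_style_loss(features, logits, labels,
                                           center, minority=1, lam=0.1):
        # `center` is a running estimate of the minority-class feature mean
        # (assumed to be maintained outside this function).
        ce = F.cross_entropy(logits, labels)
        mask = labels == minority
        if mask.any():
            diff = features[mask] - center
            first = diff.pow(2).sum(dim=1).mean()       # distance to class center
            cov = diff.t() @ diff / mask.sum()          # second-order statistics
            second = (cov - torch.diag(cov.diagonal())).pow(2).mean()
            embed = first + second
        else:
            embed = features.new_tensor(0.0)
        return ce + lam * embed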

    Semisupervised hypergraph discriminant learning for dimensionality reduction of hyperspectral image

    Semisupervised learning is an effective technique for representing the intrinsic features of a hyperspectral image (HSI) while reducing the cost of obtaining labeled samples. However, traditional semisupervised learning methods fail to consider the multiple properties of an HSI, which restricts the discriminant performance of the feature representation. In this article, we introduce the hypergraph into semisupervised learning to reveal the complex multistructures of an HSI, and construct a semisupervised discriminant hypergraph learning (SSDHL) method by designing an intraclass hypergraph and an interclass graph with the labeled samples. SSDHL also constructs an unsupervised hypergraph with the unlabeled samples, and a total scatter matrix is used to measure the distribution of the labeled and unlabeled samples. A low-dimensional projection function is then constructed to compact the properties of the intraclass hypergraph and the unsupervised hypergraph while simultaneously separating the characteristics of the interclass graph and the total scatter matrix. Finally, according to the objective function, we obtain the projection matrix and the low-dimensional features. Experiments on three HSI datasets (Botswana, KSC, and PaviaU) show that the proposed method achieves better classification results than several state-of-the-art methods, indicating that SSDHL can simultaneously utilize the labeled and unlabeled samples to represent the homogeneous properties and restrain the heterogeneous characteristics of an HSI.
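    Graph-embedding objectives of this kind typically reduce to a generalized eigenproblem over a "compact" scatter matrix and a "separate" scatter matrix. The Python sketch below shows only that generic reduction; the exact matrices SSDHL builds from its hypergraphs are not reproduced here, so the inputs are assumptions.

    import numpy as np
    from scipy.linalg import eigh

    def projection_from_scatters(S_separate, S_compact, dim, eps=1e-6):
        # Solve S_separate @ p = lam * S_compact @ p and keep the `dim`
        # eigenvectors with the largest eigenvalues as the projection matrix.
        S_c = S_compact + eps * np.eye(S_compact.shape[0])  # regularize for stability
        vals, vecs = eigh(S_separate, S_c)                  # generalized eigenproblem
        order = np.argsort(vals)[::-1]
        return vecs[:, order[:dim]]

    # Low-dimensional features are then Z = X @ P for a data matrix X.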

    Object detection in optical remote sensing images based on weakly supervised learning and high-level feature learning

    The abundant spatial and contextual information provided by advanced remote sensing technology has facilitated the automatic interpretation of optical remote sensing images (RSIs). In this paper, a novel and effective geospatial object detection framework is proposed that combines weakly supervised learning (WSL) and high-level feature learning. First, a deep Boltzmann machine is adopted to infer the spatial and structural information encoded in the low-level and middle-level features to effectively describe objects in optical RSIs. Then, a novel WSL approach to object detection is presented, in which the training sets require only binary labels indicating whether an image contains the target object or not. Based on the learned high-level features, it jointly integrates saliency, intraclass compactness, and interclass separability in a Bayesian framework to initialize a set of training examples from the weakly labeled images and begin iterative learning of the object detector. A novel evaluation criterion is also developed to detect model drift and stop the iterative learning. Comprehensive experiments on three optical RSI datasets demonstrate the efficacy of the proposed approach in benchmarks against several state-of-the-art supervised-learning-based object detection approaches.
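    The initialization step can be pictured as scoring candidate regions by the three cues and keeping the best ones as positive examples. The Python sketch below uses a plain product of cue scores as an assumed stand-in for the paper's Bayesian combination; all names and the top-k selection are illustrative.

    import numpy as np

    def select_initial_positives(saliency, compactness, separability, k=50):
        # Combine the three per-region cues into one score; the paper fuses
        # them in a Bayesian framework, a product is used here for brevity.
        score = saliency * compactness * separability
        return np.argsort(score)[::-1][:k]   # indices of the top-k regions

    # Example with random cue values for 1000 candidate regions:
    rng = np.random.default_rng(0)
    positives = select_initial_positives(rng.random(1000), rng.random(1000),
                                         rng.random(1000))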

    Urban PM2.5 concentration prediction via attention-based CNN–LSTM

    Urban particulate matter forecasting is regarded as an essential issue for the early warning and control management of air pollution, especially fine particulate matter (PM2.5). However, existing methods for PM2.5 concentration prediction neglect the effects of featured states at different times in the past on future PM2.5 concentration, and most fail to effectively simulate the temporal and spatial dependencies of PM2.5 concentration at the same time. With this in mind, we propose a deep learning-based method, AC-LSTM, which comprises a one-dimensional convolutional neural network (CNN), a long short-term memory (LSTM) network, and an attention-based network, for urban PM2.5 concentration prediction. Instead of using only air pollutant concentrations, we also add meteorological data and the PM2.5 concentrations of adjacent air quality monitoring stations as inputs to AC-LSTM. Hence, the spatiotemporal correlation and interdependence of multivariate air quality-related time-series data are learned by the CNN-LSTM network in AC-LSTM. The attention mechanism is applied to capture the importance of featured states at different times in the past for future PM2.5 concentration; the attention-based layer automatically weighs the past feature states to improve prediction accuracy. In addition, we predict the PM2.5 concentrations over the next 24 h using air quality data from Taiyuan city, China, and compare our method with six baseline methods. To compare the overall performance of each method, the mean absolute error (MAE), root-mean-square error (RMSE), and coefficient of determination (R2) are used in the experiments. The experimental results indicate that our method achieves the best performance on PM2.5 concentration prediction.
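    The architecture described above maps naturally onto a small PyTorch model: a 1-D convolution over the multivariate inputs, an LSTM over time, and a softmax attention layer that weighs past hidden states. This is a rough Python sketch in the spirit of AC-LSTM; the layer sizes, single-step output, and input dimensions are assumptions, not values from the paper.

    import torch
    import torch.nn as nn

    class ACLSTMSketch(nn.Module):
        def __init__(self, n_features, hidden=64):
            super().__init__()
            self.conv = nn.Conv1d(n_features, 32, kernel_size=3, padding=1)
            self.lstm = nn.LSTM(32, hidden, batch_first=True)
            self.attn = nn.Linear(hidden, 1)   # one attention score per time step
            self.out = nn.Linear(hidden, 1)    # next-step PM2.5 concentration

        def forward(self, x):                  # x: (batch, time, features)
            h = torch.relu(self.conv(x.transpose(1, 2))).transpose(1, 2)
            h, _ = self.lstm(h)                # (batch, time, hidden)
            w = torch.softmax(self.attn(h).squeeze(-1), dim=1)
            context = (w.unsqueeze(-1) * h).sum(dim=1)   # attention-weighted summary
            return self.out(context).squeeze(-1)

    model = ACLSTMSketch(n_features=12)
    pred = model(torch.randn(8, 24, 12))       # e.g., 24 past hours, 12 variables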

    Effective and efficient midlevel visual elements-oriented land-use classification using VHR remote sensing images

    Land-use classification using remote sensing images covers a wide range of applications. With the more detailed spatial and textural information provided by very high resolution (VHR) remote sensing images, a greater range of objects and spatial patterns can be observed than ever before, offering a new opportunity to advance the performance of land-use classification. In this paper, we first introduce an effective midlevel visual elements-oriented land-use classification method based on “partlets,” a library of pretrained part detectors used for midlevel visual element discovery. Taking advantage of midlevel visual elements rather than low-level image features, the partlets-based method represents images by computing their responses to a large number of part detectors. As the number of part detectors grows, the main obstacle to the broader application of this method is its computational cost. To address this problem, we next propose a novel framework to train coarse-to-fine shared intermediate representations, termed “sparselets,” from a large number of pretrained part detectors. This is achieved by building a single-hidden-layer autoencoder and a single-hidden-layer neural network with an L0-norm sparsity constraint, respectively. Comprehensive evaluations on a publicly available 21-class VHR land-use dataset and comparisons with state-of-the-art approaches demonstrate the effectiveness and superiority of the proposed methods.
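    The sparselets idea of reconstructing many detectors from a small shared dictionary can be sketched as a single-hidden-layer autoencoder in Python (PyTorch). The hard-thresholding step below is one common surrogate for an L0-norm constraint; the dictionary size, sparsity budget, and training loss are all assumptions for illustration.

    import torch
    import torch.nn as nn

    class SparseletAE(nn.Module):
        def __init__(self, detector_dim, n_sparselets=256, k=16):
            super().__init__()
            self.enc = nn.Linear(detector_dim, n_sparselets)
            self.dec = nn.Linear(n_sparselets, detector_dim)
            self.k = k                 # L0 budget: active sparselets per detector

        def forward(self, w):
            a = self.enc(w)
            # Keep only the k largest activations (hard thresholding), so each
            # detector is rebuilt from a small subset of shared sparselets.
            topk = torch.topk(a.abs(), self.k, dim=1).indices
            mask = torch.zeros_like(a).scatter_(1, topk, 1.0)
            return self.dec(a * mask)

    detectors = torch.randn(1000, 512)  # stand-in for pretrained part detectors
    recon = SparseletAE(512)(detectors)
    loss = nn.functional.mse_loss(recon, detectors)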

    Background prior-based salient object detection via deep reconstruction residual

    Detection of salient objects from images has gained increasing research interest in recent years, as it can substantially facilitate a wide range of content-based multimedia applications. Based on the assumption that foreground salient regions are distinctive within a certain context, most conventional approaches rely on a number of hand-designed features whose distinctiveness is measured using local or global contrast. Although these approaches have proven effective in dealing with simple images, their limited capability may cause difficulties with more complicated images. This paper proposes a novel framework for saliency detection by first modeling the background and then separating salient objects from it. We develop stacked denoising autoencoders with deep learning architectures to model the background, in which latent patterns are explored and more powerful representations of the data are learned in an unsupervised, bottom-up manner. We then formulate the separation of salient objects from the background as a problem of measuring the reconstruction residuals of the deep autoencoders. Comprehensive evaluations on three benchmark datasets and comparisons with nine state-of-the-art algorithms demonstrate the superiority of the proposed work.
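    The reconstruction-residual idea can be sketched in a few lines of Python (PyTorch): fit a denoising autoencoder to background patches, then score patches by how poorly the model reproduces them. One layer is shown rather than a full stack, and the corruption level, layer sizes, and patch representation are assumptions.

    import torch
    import torch.nn as nn

    class DenoisingAE(nn.Module):
        def __init__(self, dim, hidden=128, noise=0.1):
            super().__init__()
            self.noise = noise
            self.enc = nn.Sequential(nn.Linear(dim, hidden), nn.Sigmoid())
            self.dec = nn.Linear(hidden, dim)

        def forward(self, x):
            corrupted = x + self.noise * torch.randn_like(x)  # denoising corruption
            return self.dec(self.enc(corrupted))

    def saliency_residual(model, patches):
        # Patches the background model cannot reconstruct are taken as salient.
        with torch.no_grad():
            return (model(patches) - patches).pow(2).mean(dim=1)

    model = DenoisingAE(dim=64)        # would be trained on background patches
    scores = saliency_residual(model, torch.randn(10, 64))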

    Multi-scale diff-changed feature fusion network for hyperspectral image change detection

    For hyperspectral image (HSI) change detection (CD), multi-scale features are usually used to construct the detection models. However, existing studies consider only multi-scale features containing both changed and unchanged components, which makes it difficult to represent the subtle changes between bi-temporal HSIs at each scale. To address this problem, we propose a multi-scale diff-changed feature fusion network (MSDFFN) for HSI CD, which improves feature representation by learning the refined change components between bi-temporal HSIs at different scales. In this network, a temporal feature encoder-decoder sub-network, which combines a reduced inception module and a cross-layer attention module to highlight the significant features, is designed to extract the temporal features of HSIs. A bidirectional diff-changed feature representation module is proposed to learn the fine changed features of bi-temporal HSIs at various scales to enhance the discriminative performance for subtle changes. A multi-scale attention fusion module is developed to adaptively fuse the changed features of various scales. The proposed method can not only discover subtle changes between bi-temporal HSIs but also improve the discriminating power for HSI CD. Experimental results on three HSI datasets show that MSDFFN outperforms several state-of-the-art methods.
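    A toy Python (PyTorch) sketch of the overall flow: encode both temporal images, take differences at two scales, and fuse them with learned per-pixel attention weights before a change/no-change head. The absolute difference stands in for the bidirectional diff-changed representation, and the channel sizes and two-scale setup are assumptions rather than MSDFFN's actual layers.

    import torch
    import torch.nn as nn

    class MSDiffSketch(nn.Module):
        def __init__(self, bands, ch=32):
            super().__init__()
            self.enc1 = nn.Conv2d(bands, ch, 3, padding=1)         # scale 1
            self.enc2 = nn.Conv2d(ch, ch, 3, stride=2, padding=1)  # scale 2
            self.attn = nn.Conv2d(2 * ch, 2, 1)   # per-pixel weight per scale
            self.head = nn.Conv2d(2 * ch, 2, 1)   # changed / unchanged logits

        def forward(self, t1, t2):
            a1, b1 = self.enc1(t1), self.enc1(t2)
            a2, b2 = self.enc2(a1), self.enc2(b1)
            d1 = (b1 - a1).abs()                  # change features, scale 1
            d2 = nn.functional.interpolate((b2 - a2).abs(), size=d1.shape[-2:])
            w = torch.softmax(self.attn(torch.cat([d1, d2], dim=1)), dim=1)
            fused = torch.cat([d1 * w[:, :1], d2 * w[:, 1:]], dim=1)
            return self.head(fused)               # per-pixel change logits

    logits = MSDiffSketch(bands=30)(torch.randn(2, 30, 64, 64),
                                    torch.randn(2, 30, 64, 64))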