
    First and Second-order Information Fusion Networks for Remote Sensing Scene Classification

    Deep convolutional networks have been the most competitive method in remote sensing scene classification. Due to the diversity and complexity of scene content, remote sensing scene classification remains a challenging task. Recently, second-order pooling methods have attracted increasing interest because they can learn higher-order information and enhance the non-linear modeling ability of the networks. However, how to effectively learn second-order features and establish a discriminative feature representation of holistic images is still an open question. In this Letter, we propose a first- and second-order information fusion network (FSoI-Net) that learns first-order and second-order features at the same time, and constructs the final feature representation by fusing the two types of features. Specifically, a self-attention-based second-order pooling (SaSoP) method based on the covariance matrix is proposed to extract second-order features, and a fusion loss function is developed to jointly train the model and construct the final feature representation for the classification decision. The proposed network has been thoroughly evaluated on three real remote sensing scene datasets and achieved better performance than its counterparts.
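    The covariance-based second-order pooling this abstract builds on can be sketched as follows. This is a generic illustration with hypothetical shapes, not the paper's self-attention-based SaSoP variant: the channel covariance matrix of a feature map is flattened into a second-order descriptor.

    ```python
    import numpy as np

    def second_order_pooling(features):
        """Generic covariance-based second-order pooling (a sketch, not SaSoP).

        features: array of shape (N, C) -- N spatial positions, C channels.
        Returns the flattened upper triangle of the C x C channel covariance
        matrix, i.e. C*(C+1)//2 second-order descriptors.
        """
        mu = features.mean(axis=0, keepdims=True)
        centered = features - mu
        cov = centered.T @ centered / max(features.shape[0] - 1, 1)  # (C, C)
        iu = np.triu_indices(cov.shape[0])
        return cov[iu]

    # hypothetical input: a 7x7 feature map with 8 channels, flattened spatially
    feats = np.random.default_rng(0).normal(size=(49, 8))
    desc = second_order_pooling(feats)
    print(desc.shape)  # (36,)
    ```

    In a network, such a descriptor would typically be normalized (e.g. matrix square root or log) before the classifier; the abstract's fusion loss then combines it with the first-order branch.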

    An automatic classification method for LANDSAT data as resulting from different experiences in the Italian environment

    There are no author-identified significant results in this report

    A Hybrid Approach with Multi-channel I-Vectors and Convolutional Neural Networks for Acoustic Scene Classification

    In Acoustic Scene Classification (ASC), two major approaches have been followed. One utilizes engineered features such as mel-frequency cepstral coefficients (MFCCs), while the other uses learned features that are the outcome of an optimization algorithm. I-vectors are the result of a modeling technique that usually takes engineered features as input. It has been shown that standard MFCCs extracted from monaural audio signals lead to i-vectors that exhibit poor performance, especially on indoor acoustic scenes. At the same time, Convolutional Neural Networks (CNNs) are well known for their ability to learn features by optimizing their filters. They have been applied to ASC and have shown promising results. In this paper, we first propose a novel multi-channel i-vector extraction and scoring scheme for ASC, improving performance on indoor and outdoor scenes. Second, we propose a CNN architecture that achieves promising ASC results. Further, we show that i-vectors and CNNs capture complementary information from acoustic scenes. Finally, we propose a hybrid ASC system that combines multi-channel i-vectors and CNNs through a score fusion technique. Using our method, we participated in the ASC task of the DCASE-2016 challenge. Our hybrid approach achieved 1st rank among 49 submissions, substantially improving on the previous state of the art.
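    The score-fusion step of such a hybrid system can be illustrated with a minimal sketch. The abstract only states that a score fusion technique is used, so the weighted-average scheme and the function names below are assumptions for illustration:

    ```python
    def fuse_scores(ivector_scores, cnn_scores, alpha=0.5):
        """Late (score-level) fusion of two classifiers' per-class scores.

        alpha weights the i-vector branch, 1 - alpha the CNN branch; both
        inputs are per-class scores for one audio clip.
        """
        return [alpha * a + (1 - alpha) * b
                for a, b in zip(ivector_scores, cnn_scores)]

    def predict(scores):
        # predicted class = index of the highest fused score
        return max(range(len(scores)), key=scores.__getitem__)

    # hypothetical per-class scores for a 3-class scene task
    fused = fuse_scores([0.2, 0.7, 0.1], [0.6, 0.3, 0.1], alpha=0.5)
    print(predict(fused))  # fused = [0.4, 0.5, 0.1] -> class 1
    ```

    In practice the fusion weight would be tuned on a validation set, since the two branches capture complementary information.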

    A Compact and Discriminative Feature Based on Auditory Summary Statistics for Acoustic Scene Classification

    One of the biggest challenges of acoustic scene classification (ASC) is to find proper features that better represent and characterize environmental sounds. Environmental sounds generally involve many sound sources while exhibiting little structure in temporal-spectral representations. However, the background of an acoustic scene exhibits temporal homogeneity in acoustic properties, suggesting it could be characterized by distribution statistics rather than temporal details. In this work, we investigated using auditory summary statistics as the feature for ASC tasks. The inspiration comes from a recent neuroscience study, which shows that the human auditory system tends to perceive sound textures through time-averaged statistics. Based on these statistics, we further proposed to use linear discriminant analysis to eliminate redundancies among them while keeping the discriminative information, providing an extremely compact representation for acoustic scenes. Experimental results show the outstanding performance of the proposed feature over conventional handcrafted features. Comment: Accepted as a conference paper of Interspeech 201
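    The dimensionality-reduction step described here can be sketched with scikit-learn's off-the-shelf LDA; the data shapes below (120 clips, 60 summary statistics, 3 scene classes) are hypothetical, not the paper's setup:

    ```python
    import numpy as np
    from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

    rng = np.random.default_rng(0)
    # hypothetical: 120 clips x 60 auditory summary statistics, 3 scene classes
    X = rng.normal(size=(120, 60))
    y = np.repeat([0, 1, 2], 40)

    # LDA projects onto at most n_classes - 1 discriminative directions,
    # yielding a very compact per-clip representation
    lda = LinearDiscriminantAnalysis(n_components=2)
    X_compact = lda.fit_transform(X, y)
    print(X_compact.shape)  # (120, 2)
    ```

    Because LDA is supervised, the projection keeps class-discriminative variation while discarding redundancy among the statistics, which is what makes the resulting feature compact.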