    Recurrent Multiresolution Convolutional Networks for VHR Image Classification

    Classification of very high resolution (VHR) satellite images faces three major challenges: 1) inherent low intra-class and high inter-class spectral similarities, 2) mismatched resolutions of the available bands, and 3) the need to regularize noisy classification maps. Conventional methods address these challenges with separate stages of image fusion, feature extraction, and post-classification map regularization. These processing stages, however, are not jointly optimized for the classification task at hand. In this study, we propose a single-stage framework that embeds the processing stages in a recurrent multiresolution convolutional network trained end-to-end. The feedforward version of the network, called FuseNet, matches the resolution of the panchromatic and multispectral bands in a VHR image using convolutional layers with corresponding downsampling and upsampling operations. Contextual label information is incorporated into FuseNet by means of a recurrent version called ReuseNet. We compared FuseNet and ReuseNet against the use of separate processing steps for both image fusion, e.g., pansharpening and resampling through interpolation, and map regularization, such as conditional random fields. We carried out our experiments on a land cover classification task using a WorldView-3 image of Quezon City, Philippines, and the ISPRS 2D semantic labeling benchmark dataset of Vaihingen, Germany. FuseNet and ReuseNet surpass the baseline approaches in both quantitative and qualitative results.
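
    The fusion idea can be sketched in a few lines of PyTorch. This is a minimal illustration rather than the published architecture: the band count, layer widths, and the assumed 4x resolution gap between the panchromatic and multispectral bands are placeholders.

        import torch
        import torch.nn as nn

        class FuseNetSketch(nn.Module):
            """Toy FuseNet-style fusion: learn to upsample the multispectral
            bands to the panchromatic resolution, then classify per pixel."""
            def __init__(self, ms_bands=8, n_classes=6):
                super().__init__()
                self.up = nn.Sequential(   # learned 4x upsampling of MS bands
                    nn.ConvTranspose2d(ms_bands, 32, 4, stride=2, padding=1), nn.ReLU(),
                    nn.ConvTranspose2d(32, 32, 4, stride=2, padding=1), nn.ReLU())
                self.pan = nn.Sequential(nn.Conv2d(1, 16, 3, padding=1), nn.ReLU())
                self.head = nn.Conv2d(32 + 16, n_classes, 3, padding=1)

            def forward(self, pan, ms):
                fused = torch.cat([self.up(ms), self.pan(pan)], dim=1)
                return self.head(fused)    # per-pixel class scores

        # pan: (B, 1, H, W), ms: (B, 8, H/4, W/4) -> logits: (B, 6, H, W)
        logits = FuseNetSketch()(torch.randn(2, 1, 64, 64), torch.randn(2, 8, 16, 16))

    Because the whole pipeline is a single network, the fusion layers are optimized jointly with the classification loss, which is the point of departure from separate pansharpening and post-classification steps.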

    Convolutional Neural Network on Three Orthogonal Planes for Dynamic Texture Classification

    Dynamic Textures (DTs) are sequences of images of moving scenes, such as smoke, vegetation, and fire, that exhibit certain stationarity properties in time. The analysis of DTs is important for recognition, segmentation, synthesis, and retrieval in a range of applications including surveillance, medical imaging, and remote sensing. Deep learning methods have shown impressive results and are now the state of the art for a wide range of computer vision tasks, including image and video recognition and segmentation. In particular, Convolutional Neural Networks (CNNs) have recently proven to be well suited to texture analysis, with a design similar to a filter-bank approach. In this paper, we develop a new approach to DT analysis based on a CNN method applied on three orthogonal planes: xy, xt, and yt. We train CNNs on spatial frames and temporal slices extracted from the DT sequences and combine their outputs to obtain a competitive DT classifier. Our results on a wide range of commonly used DT classification benchmark datasets demonstrate the robustness of our approach. Significant improvement over the state of the art is shown on the larger datasets.
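
    The plane extraction at the heart of this approach is plain tensor slicing. A hedged PyTorch sketch (the clip size, grayscale input, tiny per-plane networks, and score averaging are illustrative assumptions):

        import torch
        import torch.nn as nn

        clip = torch.randn(16, 64, 64)             # (T, H, W) dynamic-texture clip

        xy = clip[8]                               # spatial frame at a fixed t: (H, W)
        xt = clip[:, 32, :]                        # temporal slice at a fixed y: (T, W)
        yt = clip[:, :, 32]                        # temporal slice at a fixed x: (T, H)

        def plane_cnn(n_classes=10):               # stand-in for a texture CNN
            return nn.Sequential(
                nn.Conv2d(1, 8, 3, padding=1), nn.ReLU(),
                nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(8, n_classes))

        cnn_xy, cnn_xt, cnn_yt = plane_cnn(), plane_cnn(), plane_cnn()

        # one CNN per plane; fuse the per-plane outputs into the final DT score
        batchify = lambda p: p.unsqueeze(0).unsqueeze(0)     # (1, 1, rows, cols)
        scores = (cnn_xy(batchify(xy)) + cnn_xt(batchify(xt)) + cnn_yt(batchify(yt))) / 3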

    A Motor-Imagery BCI System Based on Deep Learning Networks and Its Applications

    A motor imagery brain-computer interface (BCI) based on deep learning models is proposed in this paper, in which electroencephalogram (EEG) signals of motor imagery (MI-EEG) are used to identify different imagery activities. The brain dynamics of motor imagery are usually measured by EEG as non-stationary time series with a low signal-to-noise ratio. A variety of methods have previously been developed to classify MI-EEG signals, but they achieve unsatisfactory results owing to a lack of characteristic time-frequency features. In this paper, the discrete wavelet transform (DWT) is applied to MI-EEG signals to extract their effective coefficients as time-frequency features. Two deep learning (DL) models, long short-term memory (LSTM) and gated recurrent neural networks (GRNN), are then used to classify the MI-EEG data. LSTM is designed to counteract vanishing gradients; GRNN lets each recurrent unit adaptively capture dependencies at different time scales. Like the LSTM unit, the GRNN unit has gating units that modulate the flow of information inside the unit, but without a separate memory cell. Experimental results show that GRNN and LSTM yield higher classification accuracies than existing approaches, which is helpful for further research on and application of recurrent neural networks to MI-EEG processing.
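
    A rough sketch of such a pipeline using PyWavelets and PyTorch. The wavelet family, decomposition level, retained sub-bands, channel count, and binary task are all illustrative assumptions, not the paper's exact settings:

        import numpy as np
        import pywt
        import torch
        import torch.nn as nn

        trial = np.random.randn(3, 640)            # (channels, samples) toy MI-EEG trial

        # DWT per channel; keep selected detail coefficients as time-frequency features
        coeffs = [pywt.wavedec(ch, 'db4', level=4) for ch in trial]
        feats = np.stack([np.concatenate(c[1:3]) for c in coeffs])   # cD4 and cD3

        x = torch.tensor(feats, dtype=torch.float32).unsqueeze(0)    # (1, seq, feat)
        gru = nn.GRU(input_size=x.shape[-1], hidden_size=32, batch_first=True)
        head = nn.Linear(32, 2)                    # e.g. left- vs right-hand imagery
        _, h = gru(x)                              # h: (num_layers, batch, hidden)
        logits = head(h[-1])                       # (1, 2) class scores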

    The wavelet-NARMAX representation : a hybrid model structure combining polynomial models with multiresolution wavelet decompositions

    A new hybrid model structure combining polynomial models with multiresolution wavelet decompositions is introduced for nonlinear system identification. Polynomial models play an important role in approximation theory and have been extensively used in linear and nonlinear system identification. Wavelet decompositions, in which the basis functions are localized in both time and frequency, outperform many other approximation schemes and offer a flexible solution for approximating arbitrary functions. Although wavelet representations can approximate even severe nonlinearities in a given signal very well, their advantage can be lost when wavelets are used to capture linear or low-order nonlinear behaviour in a signal. To exploit the global property of polynomials and the local property of wavelet representations simultaneously, in this study polynomial models and wavelet decompositions are combined in a parallel structure to represent nonlinear input-output systems. As a special form of the NARMAX model, this hybrid model structure will be referred to as the WAvelet-NARMAX model, or simply WANARMAX. Generally, such a WANARMAX representation of an input-output system might involve a large number of basis functions and therefore a great number of model terms. Experience reveals that only a small number of these terms are significant to the system output. A new fast orthogonal least squares algorithm, called the matching pursuit orthogonal least squares (MPOLS) algorithm, is also introduced to determine which terms should be included in the final model.
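
    A toy NumPy sketch of the parallel structure and the greedy term selection: the candidate dictionary mixes polynomial lagged terms with wavelet terms, and a forward selection loop in the spirit of (though far simpler than) MPOLS picks the few significant ones. The system, wavelet prototype, and translation grid are invented for illustration:

        import numpy as np

        def mexhat(t):                             # Mexican-hat wavelet prototype
            return (1 - t**2) * np.exp(-t**2 / 2)

        rng = np.random.default_rng(0)
        u = rng.standard_normal(200)               # input sequence
        y = np.zeros(200)
        for k in range(2, 200):                    # toy nonlinear system to identify
            y[k] = 0.5*y[k-1] + 0.3*u[k-1] + 0.2*y[k-1]*u[k-2] + 0.01*rng.standard_normal()

        Y = y[2:]
        cols = [y[1:-1], u[1:-1], u[:-2], y[1:-1]*u[:-2], y[1:-1]**2]    # polynomial terms
        cols += [mexhat((y[1:-1] - c) / 0.5) for c in (-1.0, 0.0, 1.0)]  # wavelet terms
        D = np.column_stack(cols)                  # candidate term dictionary

        selected, resid = [], Y.copy()
        for _ in range(3):                         # greedy forward selection
            scores = np.abs(D.T @ resid) / np.linalg.norm(D, axis=0)
            selected.append(int(np.argmax(scores)))
            theta, *_ = np.linalg.lstsq(D[:, selected], Y, rcond=None)
            resid = Y - D[:, selected] @ theta     # refit, then repeat on the residual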

    Deep Hierarchical Super-Resolution for Scientific Data Reduction and Visualization

    We present an approach for hierarchical super-resolution (SR) using neural networks on an octree data representation. We train a hierarchy of neural networks, each capable of 2x upscaling in each spatial dimension between two levels of detail, and use these networks in tandem to facilitate super-resolution at large scale factors, scaling with the number of trained networks. We utilize these networks in a hierarchical super-resolution algorithm that upscales multiresolution data to a uniform high resolution without introducing seam artifacts on octree node boundaries. We evaluate the application of this algorithm in a data reduction framework by dynamically downscaling input data into an octree-based data structure that represents the multiresolution data before compressing for additional storage reduction. We demonstrate that our approach avoids seam artifacts common to multiresolution data formats, and show how neural-network-assisted super-resolution data reduction can preserve global features better than compressors alone at the same compression ratios.
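
    The tandem use of per-level 2x networks can be pictured in a few lines; the sketch below substitutes 2D images for octree volume bricks and omits the boundary handling that prevents seams, so the architecture and sizes are assumptions:

        import torch
        import torch.nn as nn

        class Upscale2x(nn.Module):
            """One hierarchy level: learned 2x upscaling in each spatial dimension."""
            def __init__(self, ch=1):
                super().__init__()
                self.net = nn.Sequential(
                    nn.Conv2d(ch, 32, 3, padding=1), nn.ReLU(),
                    nn.ConvTranspose2d(32, ch, 4, stride=2, padding=1))
            def forward(self, x):
                return self.net(x)

        # chain one trained network per level of detail for a large total factor
        levels = [Upscale2x() for _ in range(3)]
        x = torch.randn(1, 1, 16, 16)              # coarsest level of detail
        for net in levels:                         # 16 -> 32 -> 64 -> 128 (8x total)
            x = net(x)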

    TimeScaleNet : a Multiresolution Approach for Raw Audio Recognition using Learnable Biquadratic IIR Filters and Residual Networks of Depthwise-Separable One-Dimensional Atrous Convolutions

    In the present paper, we show the benefit of a multiresolution approach that encodes the relevant information contained in unprocessed time-domain acoustic signals. TimeScaleNet aims to learn an efficient representation of a sound by learning time dependencies both at the sample level and at the frame level. The proposed approach improves the interpretability of the learning scheme by unifying advanced deep learning and signal processing techniques. In particular, TimeScaleNet's architecture introduces a new form of recurrent neural layer, directly inspired by digital IIR signal processing. This layer acts as a learnable passband biquadratic digital IIR filterbank. The learnable filterbank builds a time-frequency-like feature map that self-adapts to the specific recognition task and dataset, with a large receptive field and very few learnable parameters. The resulting frame-level feature map is then processed using a residual network of depthwise-separable atrous convolutions. This second scale of analysis aims to efficiently encode relationships between time fluctuations at the frame timescale, in different learnt pooled frequency bands, in the range of 20 ms to 200 ms. TimeScaleNet is tested on both the Speech Commands Dataset and the ESC-10 Dataset. We report a very high mean accuracy of 94.87 ± 0.24% (macro-averaged F1-score: 94.9 ± 0.24%) for speech recognition, and a more moderate accuracy of 69.71 ± 1.91% (macro-averaged F1-score: 70.14 ± 1.57%) for the environmental sound classification task.
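
    The sample-level stage can be pictured as a band-pass biquad whose centre frequency and quality factor are trainable. A minimal sketch with torchaudio, using the standard audio-EQ coefficient formulas; the parameterization is an assumption, not the paper's exact layer:

        import math
        import torch
        import torchaudio.functional as AF

        def bandpass_biquad(f0, Q, sr=16000.0):
            """Band-pass biquad coefficients, differentiable in f0 and Q."""
            w0 = 2 * math.pi * f0 / sr
            alpha = torch.sin(w0) / (2 * Q)
            b = torch.stack([alpha, torch.zeros_like(alpha), -alpha])
            a = torch.stack([1 + alpha, -2 * torch.cos(w0), 1 - alpha])
            return b / a[0], a / a[0]

        f0 = torch.tensor(440.0, requires_grad=True)   # learnable centre frequency (Hz)
        Q = torch.tensor(1.0, requires_grad=True)      # learnable quality factor
        wave = torch.randn(1, 16000)                   # 1 s of raw audio at 16 kHz
        b, a = bandpass_biquad(f0, Q)
        out = AF.lfilter(wave, a, b)                   # gradients flow back to f0 and Q

    A bank of such filters, each with its own learnt centre frequency and Q, yields the time-frequency-like feature map that the residual network of atrous convolutions then processes at the frame timescale.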