    Recurrent Multiresolution Convolutional Networks for VHR Image Classification

    Classification of very high resolution (VHR) satellite images faces three major challenges: 1) inherent low intra-class and high inter-class spectral similarities, 2) mismatched resolutions of the available bands, and 3) the need to regularize noisy classification maps. Conventional methods address these challenges with separate stages of image fusion, feature extraction, and post-classification map regularization. These processing stages, however, are not jointly optimized for the classification task at hand. In this study, we propose a single-stage framework that embeds the processing stages in a recurrent multiresolution convolutional network trained end-to-end. The feedforward version of the network, called FuseNet, matches the resolution of the panchromatic and multispectral bands in a VHR image using convolutional layers with corresponding downsampling and upsampling operations. Contextual label information is incorporated into FuseNet by means of a recurrent version called ReuseNet. We compared FuseNet and ReuseNet against the use of separate processing steps for both image fusion, e.g., pansharpening and resampling through interpolation, and map regularization, such as conditional random fields. We carried out our experiments on a land cover classification task using a WorldView-3 image of Quezon City, Philippines, and the ISPRS 2D semantic labeling benchmark dataset of Vaihingen, Germany. FuseNet and ReuseNet surpass the baseline approaches in both quantitative and qualitative results.
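
    The fusion idea can be sketched in a few lines of PyTorch. This is a minimal illustration rather than the published architecture: the band count, layer widths, and the assumed 4x resolution gap between the panchromatic and multispectral bands are placeholders.

        import torch
        import torch.nn as nn

        class FuseNetSketch(nn.Module):
            """Toy FuseNet-style fusion: learn to upsample the multispectral
            bands to the panchromatic resolution, then classify per pixel."""
            def __init__(self, ms_bands=8, n_classes=6):
                super().__init__()
                self.up = nn.Sequential(   # learned 4x upsampling of MS bands
                    nn.ConvTranspose2d(ms_bands, 32, 4, stride=2, padding=1), nn.ReLU(),
                    nn.ConvTranspose2d(32, 32, 4, stride=2, padding=1), nn.ReLU())
                self.pan = nn.Sequential(nn.Conv2d(1, 16, 3, padding=1), nn.ReLU())
                self.head = nn.Conv2d(32 + 16, n_classes, 3, padding=1)

            def forward(self, pan, ms):
                fused = torch.cat([self.up(ms), self.pan(pan)], dim=1)
                return self.head(fused)    # per-pixel class scores

        # pan: (B, 1, H, W), ms: (B, 8, H/4, W/4) -> logits: (B, 6, H, W)
        logits = FuseNetSketch()(torch.randn(2, 1, 64, 64), torch.randn(2, 8, 16, 16))

    Because the whole pipeline is a single network, the fusion layers are optimized jointly with the classification loss, which is the point of departure from separate pansharpening and post-classification steps.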

    Convolutional Neural Network on Three Orthogonal Planes for Dynamic Texture Classification

    Dynamic Textures (DTs) are sequences of images of moving scenes, such as smoke, vegetation, and fire, that exhibit certain stationarity properties in time. The analysis of DTs is important for recognition, segmentation, synthesis, and retrieval in a range of applications including surveillance, medical imaging, and remote sensing. Deep learning methods have shown impressive results and are now the state of the art for a wide range of computer vision tasks, including image and video recognition and segmentation. In particular, Convolutional Neural Networks (CNNs) have recently proven to be well suited to texture analysis, with a design similar to a filter-bank approach. In this paper, we develop a new approach to DT analysis based on a CNN method applied on three orthogonal planes: xy, xt, and yt. We train CNNs on spatial frames and temporal slices extracted from the DT sequences and combine their outputs to obtain a competitive DT classifier. Our results on a wide range of commonly used DT classification benchmark datasets demonstrate the robustness of our approach. Significant improvement over the state of the art is shown on the larger datasets.
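
    The plane extraction at the heart of this approach is plain tensor slicing. A hedged PyTorch sketch (the clip size, grayscale input, tiny per-plane networks, and score averaging are illustrative assumptions):

        import torch
        import torch.nn as nn

        clip = torch.randn(16, 64, 64)             # (T, H, W) dynamic-texture clip

        xy = clip[8]                               # spatial frame at a fixed t: (H, W)
        xt = clip[:, 32, :]                        # temporal slice at a fixed y: (T, W)
        yt = clip[:, :, 32]                        # temporal slice at a fixed x: (T, H)

        def plane_cnn(n_classes=10):               # stand-in for a texture CNN
            return nn.Sequential(
                nn.Conv2d(1, 8, 3, padding=1), nn.ReLU(),
                nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(8, n_classes))

        cnn_xy, cnn_xt, cnn_yt = plane_cnn(), plane_cnn(), plane_cnn()

        # one CNN per plane; fuse the per-plane outputs into the final DT score
        batchify = lambda p: p.unsqueeze(0).unsqueeze(0)     # (1, 1, rows, cols)
        scores = (cnn_xy(batchify(xy)) + cnn_xt(batchify(xt)) + cnn_yt(batchify(yt))) / 3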

    A Motor-Imagery BCI System Based on Deep Learning Networks and Its Applications

    A motor imagery brain-computer interface (BCI) based on deep learning models is proposed in this paper, in which electroencephalogram (EEG) signals of motor imagery (MI-EEG) are used to identify different imagery activities. The brain dynamics of motor imagery are usually measured by EEG as non-stationary time series with a low signal-to-noise ratio. A variety of methods have previously been developed to classify MI-EEG signals, but they achieve unsatisfactory results owing to a lack of characteristic time-frequency features. In this paper, the discrete wavelet transform (DWT) is applied to MI-EEG signals to extract their effective coefficients as time-frequency features. Two deep learning (DL) models, long short-term memory (LSTM) and gated recurrent neural networks (GRNN), are then used to classify the MI-EEG data. LSTM is designed to counteract vanishing gradients; GRNN lets each recurrent unit adaptively capture dependencies at different time scales. Like the LSTM unit, the GRNN unit has gating units that modulate the flow of information inside the unit, but without a separate memory cell. Experimental results show that GRNN and LSTM yield higher classification accuracies than existing approaches, which is helpful for further research on and application of recurrent neural networks to MI-EEG processing.
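
    A rough sketch of such a pipeline using PyWavelets and PyTorch. The wavelet family, decomposition level, retained sub-bands, channel count, and binary task are all illustrative assumptions, not the paper's exact settings:

        import numpy as np
        import pywt
        import torch
        import torch.nn as nn

        trial = np.random.randn(3, 640)            # (channels, samples) toy MI-EEG trial

        # DWT per channel; keep selected detail coefficients as time-frequency features
        coeffs = [pywt.wavedec(ch, 'db4', level=4) for ch in trial]
        feats = np.stack([np.concatenate(c[1:3]) for c in coeffs])   # cD4 and cD3

        x = torch.tensor(feats, dtype=torch.float32).unsqueeze(0)    # (1, seq, feat)
        gru = nn.GRU(input_size=x.shape[-1], hidden_size=32, batch_first=True)
        head = nn.Linear(32, 2)                    # e.g. left- vs right-hand imagery
        _, h = gru(x)                              # h: (num_layers, batch, hidden)
        logits = head(h[-1])                       # (1, 2) class scores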

    The wavelet-NARMAX representation : a hybrid model structure combining polynomial models with multiresolution wavelet decompositions

    A new hybrid model structure combining polynomial models with multiresolution wavelet decompositions is introduced for nonlinear system identification. Polynomial models play an important role in approximation theory and have been extensively used in linear and nonlinear system identification. Wavelet decompositions, in which the basis functions are localized in both time and frequency, outperform many other approximation schemes and offer a flexible solution for approximating arbitrary functions. Although wavelet representations can approximate even severe nonlinearities in a given signal very well, their advantage can be lost when wavelets are used to capture linear or low-order nonlinear behaviour in a signal. To exploit the global property of polynomials and the local property of wavelet representations simultaneously, in this study polynomial models and wavelet decompositions are combined in a parallel structure to represent nonlinear input-output systems. As a special form of the NARMAX model, this hybrid model structure will be referred to as the WAvelet-NARMAX model, or simply WANARMAX. Generally, such a WANARMAX representation of an input-output system might involve a large number of basis functions and therefore a great number of model terms. Experience reveals that only a small number of these terms are significant to the system output. A new fast orthogonal least squares algorithm, called the matching pursuit orthogonal least squares (MPOLS) algorithm, is also introduced to determine which terms should be included in the final model.
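
    A toy NumPy sketch of the parallel structure and the greedy term selection: the candidate dictionary mixes polynomial lagged terms with wavelet terms, and a forward selection loop in the spirit of (though far simpler than) MPOLS picks the few significant ones. The system, wavelet prototype, and translation grid are invented for illustration:

        import numpy as np

        def mexhat(t):                             # Mexican-hat wavelet prototype
            return (1 - t**2) * np.exp(-t**2 / 2)

        rng = np.random.default_rng(0)
        u = rng.standard_normal(200)               # input sequence
        y = np.zeros(200)
        for k in range(2, 200):                    # toy nonlinear system to identify
            y[k] = 0.5*y[k-1] + 0.3*u[k-1] + 0.2*y[k-1]*u[k-2] + 0.01*rng.standard_normal()

        Y = y[2:]
        cols = [y[1:-1], u[1:-1], u[:-2], y[1:-1]*u[:-2], y[1:-1]**2]    # polynomial terms
        cols += [mexhat((y[1:-1] - c) / 0.5) for c in (-1.0, 0.0, 1.0)]  # wavelet terms
        D = np.column_stack(cols)                  # candidate term dictionary

        selected, resid = [], Y.copy()
        for _ in range(3):                         # greedy forward selection
            scores = np.abs(D.T @ resid) / np.linalg.norm(D, axis=0)
            selected.append(int(np.argmax(scores)))
            theta, *_ = np.linalg.lstsq(D[:, selected], Y, rcond=None)
            resid = Y - D[:, selected] @ theta     # refit, then repeat on the residual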

    Deep Hierarchical Super-Resolution for Scientific Data Reduction and Visualization

    We present an approach for hierarchical super-resolution (SR) using neural networks on an octree data representation. We train a hierarchy of neural networks, each capable of 2x upscaling in each spatial dimension between two levels of detail, and use these networks in tandem to facilitate super-resolution at large scale factors, scaling with the number of trained networks. We utilize these networks in a hierarchical super-resolution algorithm that upscales multiresolution data to a uniform high resolution without introducing seam artifacts on octree node boundaries. We evaluate the application of this algorithm in a data reduction framework by dynamically downscaling input data into an octree-based data structure that represents the multiresolution data before compressing for additional storage reduction. We demonstrate that our approach avoids seam artifacts common to multiresolution data formats, and show how neural-network-assisted super-resolution data reduction can preserve global features better than compressors alone at the same compression ratios.
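
    The tandem use of per-level 2x networks can be pictured in a few lines; the sketch below substitutes 2D images for octree volume bricks and omits the boundary handling that prevents seams, so the architecture and sizes are assumptions:

        import torch
        import torch.nn as nn

        class Upscale2x(nn.Module):
            """One hierarchy level: learned 2x upscaling in each spatial dimension."""
            def __init__(self, ch=1):
                super().__init__()
                self.net = nn.Sequential(
                    nn.Conv2d(ch, 32, 3, padding=1), nn.ReLU(),
                    nn.ConvTranspose2d(32, ch, 4, stride=2, padding=1))
            def forward(self, x):
                return self.net(x)

        # chain one trained network per level of detail for a large total factor
        levels = [Upscale2x() for _ in range(3)]
        x = torch.randn(1, 1, 16, 16)              # coarsest level of detail
        for net in levels:                         # 16 -> 32 -> 64 -> 128 (8x total)
            x = net(x)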

    TimeScaleNet : a Multiresolution Approach for Raw Audio Recognition using Learnable Biquadratic IIR Filters and Residual Networks of Depthwise-Separable One-Dimensional Atrous Convolutions

    In the present paper, we show the benefit of a multiresolution approach that encodes the relevant information contained in unprocessed time-domain acoustic signals. TimeScaleNet aims to learn an efficient representation of a sound by learning time dependencies both at the sample level and at the frame level. The proposed approach improves the interpretability of the learning scheme by unifying advanced deep learning and signal processing techniques. In particular, TimeScaleNet's architecture introduces a new form of recurrent neural layer, directly inspired by digital IIR signal processing. This layer acts as a learnable passband biquadratic digital IIR filterbank. The learnable filterbank builds a time-frequency-like feature map that self-adapts to the specific recognition task and dataset, with a large receptive field and very few learnable parameters. The resulting frame-level feature map is then processed using a residual network of depthwise-separable atrous convolutions. This second scale of analysis aims to efficiently encode relationships between time fluctuations at the frame timescale, in different learnt pooled frequency bands, in the range of 20 ms to 200 ms. TimeScaleNet is tested on both the Speech Commands Dataset and the ESC-10 Dataset. We report a very high mean accuracy of 94.87 ± 0.24% (macro-averaged F1-score: 94.9 ± 0.24%) for speech recognition, and a more moderate accuracy of 69.71 ± 1.91% (macro-averaged F1-score: 70.14 ± 1.57%) for the environmental sound classification task.
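
    The sample-level stage can be pictured as a band-pass biquad whose centre frequency and quality factor are trainable. A minimal sketch with torchaudio, using the standard audio-EQ coefficient formulas; the parameterization is an assumption, not the paper's exact layer:

        import math
        import torch
        import torchaudio.functional as AF

        def bandpass_biquad(f0, Q, sr=16000.0):
            """Band-pass biquad coefficients, differentiable in f0 and Q."""
            w0 = 2 * math.pi * f0 / sr
            alpha = torch.sin(w0) / (2 * Q)
            b = torch.stack([alpha, torch.zeros_like(alpha), -alpha])
            a = torch.stack([1 + alpha, -2 * torch.cos(w0), 1 - alpha])
            return b / a[0], a / a[0]

        f0 = torch.tensor(440.0, requires_grad=True)   # learnable centre frequency (Hz)
        Q = torch.tensor(1.0, requires_grad=True)      # learnable quality factor
        wave = torch.randn(1, 16000)                   # 1 s of raw audio at 16 kHz
        b, a = bandpass_biquad(f0, Q)
        out = AF.lfilter(wave, a, b)                   # gradients flow back to f0 and Q

    A bank of such filters, each with its own learnt centre frequency and Q, yields the time-frequency-like feature map that the residual network of atrous convolutions then processes at the frame timescale.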