53,655 research outputs found

    Cyclic gate recurrent neural networks for time series data with missing values

    Get PDF
    Gated Recurrent Neural Networks (RNNs) such as LSTM and GRU have been highly effective in handling sequential time series data in recent years. Although Gated RNNs have an inherent ability to learn complex temporal dynamics, there is potential for further enhancement by enabling these deep learning networks to directly use time information to recognise time-dependent patterns in data and identify important segments of time. Synonymous with time series data in real-world applications are missing values, which often reduce a model’s ability to perform predictive tasks. Historically, missing values have been handled by simple or complex imputation techniques as well as machine learning models, which manage the missing values in the prediction layers. However, these methods do not attempt to identify the significance of data segments and therefore are susceptible to poor imputation values or model degradation from high missing value rates. This paper develops Cyclic Gate enhanced recurrent neural networks with learnt waveform parameters to automatically identify important data segments within a time series and neglect unimportant segments. By using the proposed networks, the negative impact of missing data on model performance is mitigated through the addition of customised cyclic opening and closing gate operations. Cyclic Gate Recurrent Neural Networks are tested on several sequential time series datasets for classification performance. For long sequence datasets with high rates of missing values, Cyclic Gate enhanced RNN models achieve higher performance metrics than standard gated recurrent neural network models, conventional non-neural network machine learning algorithms and current state of the art RNN cell variants

    Convolutional Drift Networks for Video Classification

    Full text link
    Analyzing spatio-temporal data like video is a challenging task that requires processing visual and temporal information effectively. Convolutional Neural Networks have shown promise as baseline fixed feature extractors through transfer learning, a technique that helps minimize the training cost on visual information. Temporal information is often handled using hand-crafted features or Recurrent Neural Networks, but this can be overly specific or prohibitively complex. Building a fully trainable system that can efficiently analyze spatio-temporal data without hand-crafted features or complex training is an open challenge. We present a new neural network architecture to address this challenge, the Convolutional Drift Network (CDN). Our CDN architecture combines the visual feature extraction power of deep Convolutional Neural Networks with the intrinsically efficient temporal processing provided by Reservoir Computing. In this introductory paper on the CDN, we provide a very simple baseline implementation tested on two egocentric (first-person) video activity datasets.We achieve video-level activity classification results on-par with state-of-the art methods. Notably, performance on this complex spatio-temporal task was produced by only training a single feed-forward layer in the CDN.Comment: Published in IEEE Rebooting Computin

    Convolutional RNN: an Enhanced Model for Extracting Features from Sequential Data

    Get PDF
    Traditional convolutional layers extract features from patches of data by applying a non-linearity on an affine function of the input. We propose a model that enhances this feature extraction process for the case of sequential data, by feeding patches of the data into a recurrent neural network and using the outputs or hidden states of the recurrent units to compute the extracted features. By doing so, we exploit the fact that a window containing a few frames of the sequential data is a sequence itself and this additional structure might encapsulate valuable information. In addition, we allow for more steps of computation in the feature extraction process, which is potentially beneficial as an affine function followed by a non-linearity can result in too simple features. Using our convolutional recurrent layers we obtain an improvement in performance in two audio classification tasks, compared to traditional convolutional layers. Tensorflow code for the convolutional recurrent layers is publicly available in https://github.com/cruvadom/Convolutional-RNN

    Deep Learning for Audio Signal Processing

    Full text link
    Given the recent surge in developments of deep learning, this article provides a review of the state-of-the-art deep learning techniques for audio signal processing. Speech, music, and environmental sound processing are considered side-by-side, in order to point out similarities and differences between the domains, highlighting general methods, problems, key references, and potential for cross-fertilization between areas. The dominant feature representations (in particular, log-mel spectra and raw waveform) and deep learning models are reviewed, including convolutional neural networks, variants of the long short-term memory architecture, as well as more audio-specific neural network models. Subsequently, prominent deep learning application areas are covered, i.e. audio recognition (automatic speech recognition, music information retrieval, environmental sound detection, localization and tracking) and synthesis and transformation (source separation, audio enhancement, generative models for speech, sound, and music synthesis). Finally, key issues and future questions regarding deep learning applied to audio signal processing are identified.Comment: 15 pages, 2 pdf figure

    Temporal Attention-Gated Model for Robust Sequence Classification

    Full text link
    Typical techniques for sequence classification are designed for well-segmented sequences which have been edited to remove noisy or irrelevant parts. Therefore, such methods cannot be easily applied on noisy sequences expected in real-world applications. In this paper, we present the Temporal Attention-Gated Model (TAGM) which integrates ideas from attention models and gated recurrent networks to better deal with noisy or unsegmented sequences. Specifically, we extend the concept of attention model to measure the relevance of each observation (time step) of a sequence. We then use a novel gated recurrent network to learn the hidden representation for the final prediction. An important advantage of our approach is interpretability since the temporal attention weights provide a meaningful value for the salience of each time step in the sequence. We demonstrate the merits of our TAGM approach, both for prediction accuracy and interpretability, on three different tasks: spoken digit recognition, text-based sentiment analysis and visual event recognition.Comment: Accepted by CVPR 201
    • …
    corecore