
    Rethinking Full Connectivity in Recurrent Neural Networks

    Recurrent neural networks (RNNs) are omnipresent in sequence modeling tasks. Practical models usually consist of several layers of hundreds or thousands of neurons which are fully connected. This places a heavy computational and memory burden on hardware, restricting adoption in practical low-cost and low-power devices. Compared to fully convolutional models, the costly sequential operation of RNNs severely hinders performance on parallel hardware. This paper challenges the convention of full connectivity in RNNs. We study structurally sparse RNNs, showing that they are well suited for acceleration on parallel hardware, with a greatly reduced cost of the recurrent operations as well as orders of magnitude fewer recurrent weights. Extensive experiments on challenging tasks ranging from language modeling and speech recognition to video action recognition reveal that structurally sparse RNNs achieve competitive performance compared to fully connected networks. This allows for using large sparse RNNs for a wide range of real-world tasks that previously were too costly with fully connected networks.
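    As a rough illustration of why structural sparsity cuts the recurrent cost, the sketch below compares a dense hidden-to-hidden matrix with a block-diagonal one. The block-diagonal layout, the function names, and the sizes are assumptions made for this example; the abstract does not say which sparsity structure the paper uses.

```python
# Hypothetical illustration: block-diagonal recurrence as ONE possible form
# of structural sparsity. The layout is an assumption, not the paper's method.


def dense_recurrent_weights(hidden_size: int) -> int:
    """Number of weights in a fully connected hidden-to-hidden matrix."""
    return hidden_size * hidden_size


def block_diagonal_recurrent_weights(hidden_size: int, num_blocks: int) -> int:
    """Number of weights when neurons only connect within their own block."""
    if hidden_size % num_blocks != 0:
        raise ValueError("hidden_size must be divisible by num_blocks")
    block_size = hidden_size // num_blocks
    return num_blocks * block_size * block_size


def block_diagonal_step(h, blocks):
    """One linear recurrent update h_new = W h with W block-diagonal.

    `blocks` is a list of square matrices (lists of rows). Each block reads
    only its own slice of h, so the blocks can be computed independently.
    """
    out = []
    offset = 0
    for block in blocks:
        size = len(block)
        segment = h[offset:offset + size]
        out.extend(sum(w * x for w, x in zip(row, segment)) for row in block)
        offset += size
    return out
```

    With these assumed sizes, a hidden state of 1024 units split into 16 blocks needs 16 times fewer recurrent weights than the dense matrix, and each block's product can run in parallel.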

    A general-purpose deep learning approach to model time-varying audio effects

    Audio processors whose parameters are modified periodically over time are often referred to as time-varying or modulation-based audio effects. Most existing methods for modeling this type of effect unit are optimized to a very specific circuit and cannot be efficiently generalized to other time-varying effects. Based on convolutional and recurrent neural networks, we propose a deep learning architecture for generic black-box modeling of audio processors with long-term memory. We explore the capabilities of deep neural networks to learn such long temporal dependencies and we show the network modeling various linear and nonlinear, time-varying and time-invariant audio effects. In order to measure the performance of the model, we propose an objective metric based on the psychoacoustics of modulation frequency perception. We also analyze what the model is actually learning and how the given task is accomplished.
    Comment: audio files: https://mchijmma.github.io/modeling-time-varying

    Feed-forward approximations to dynamic recurrent network architectures

    Recurrent neural network architectures can have useful computational properties, with complex temporal dynamics and input-sensitive attractor states. However, evaluation of recurrent dynamic architectures requires solution of systems of differential equations, and the number of evaluations required to determine their response to a given input can vary with the input, or can be indeterminate altogether in the case of oscillations or instability. In feed-forward networks, by contrast, only a single pass through the network is needed to determine the response to a given input. Modern machine-learning systems are designed to operate efficiently on feed-forward architectures. We hypothesised that two-layer feed-forward architectures with simple, deterministic dynamics could approximate the responses of single-layer recurrent network architectures. By identifying the fixed-point responses of a given recurrent network, we trained two-layer networks to directly approximate the fixed-point response to a given input. These feed-forward networks then embodied useful computations, including competitive interactions, information transformations and noise rejection. Our approach was able to find useful approximations to recurrent networks, which can then be evaluated in linear and deterministic time complexity.
    Comment: Author's final version, accepted for publication in Neural Computation

    Variational online learning of neural dynamics

    New technologies for recording the activity of large neural populations during complex behavior provide exciting opportunities for investigating the neural computations that underlie perception, cognition, and decision-making. Nonlinear state space models provide an interpretable signal processing framework by combining an intuitive dynamical system with a probabilistic observation model, which can provide insights into neural dynamics, neural computation, and development of neural prosthetics and treatment through feedback control. This brings the challenge of learning both the latent neural state and the underlying dynamical system, because neither is known a priori for neural systems. We developed a flexible online learning framework for latent nonlinear state dynamics and filtered latent states. Using the stochastic gradient variational Bayes approach, our method jointly optimizes the parameters of the nonlinear dynamical system, the observation model, and the black-box recognition model. Unlike previous approaches, our framework can incorporate non-trivial distributions of observation noise and has constant time and space complexity. These features make our approach amenable to real-time applications and give it the potential to automate analysis and experimental design in ways that testably track and modify behavior using stimuli designed to influence learning.
    Comment: accepted by Frontiers in Computational Neuroscience

    Multistep Speed Prediction on Traffic Networks: A Graph Convolutional Sequence-to-Sequence Learning Approach with Attention Mechanism

    Multistep traffic forecasting on road networks is a crucial task in successful intelligent transportation system applications. To capture the complex non-stationary temporal dynamics and spatial dependency in multistep traffic-condition prediction, we propose a novel deep learning framework named attention graph convolutional sequence-to-sequence model (AGC-Seq2Seq). In the proposed deep learning framework, spatial and temporal dependencies are modeled through the Seq2Seq model and graph convolution network separately, and the attention mechanism along with a newly designed training method based on the Seq2Seq architecture is proposed to overcome the difficulty in multistep prediction and further capture the temporal heterogeneity of traffic patterns. We conduct numerical tests to compare AGC-Seq2Seq with other benchmark models using a real-world dataset. The results indicate that our model yields the best prediction performance in terms of various prediction error measures. Furthermore, the variation of spatiotemporal correlation of traffic conditions under different prediction steps and road segments is revealed through sensitivity analyses.

    NTIRE 2020 Challenge on Image and Video Deblurring

    Motion blur is one of the most common degradation artifacts in dynamic scene photography. This paper reviews the NTIRE 2020 Challenge on Image and Video Deblurring. In this challenge, we present the evaluation results from 3 competition tracks as well as the proposed solutions. Track 1 aims to develop single-image deblurring methods focusing on restoration quality. In Track 2, the image deblurring methods are executed on a mobile platform to find the balance between running speed and restoration accuracy. Track 3 targets developing video deblurring methods that exploit the temporal relation between input frames. The three tracks had 163, 135, and 102 registered participants, respectively, and in the final testing phase 9, 4, and 7 teams competed. The winning methods demonstrate state-of-the-art performance on image and video deblurring tasks.
    Comment: To be published in CVPR 2020 Workshop (New Trends in Image Restoration and Enhancement)

    Context-Aware Deep Spatio-Temporal Network for Hand Pose Estimation from Depth Images

    As a fundamental and challenging problem in computer vision, hand pose estimation aims to estimate the hand joint locations from depth images. Typically, the problem is modeled as learning a mapping function from images to hand joint coordinates in a data-driven manner. In this paper, we propose Context-Aware Deep Spatio-Temporal Network (CADSTN), a novel method to jointly model the spatio-temporal properties for hand pose estimation. Our proposed network is able to learn the representations of the spatial information and the temporal structure from the image sequences. Moreover, by adopting an adaptive fusion method, the model is capable of dynamically weighting different predictions to lay emphasis on sufficient context. Our method is examined on two common benchmarks; the experimental results demonstrate that our proposed approach achieves the best or the second-best performance compared with state-of-the-art methods and runs at 60 fps.
    Comment: IEEE Transactions On Cybernetics

    General Backpropagation Algorithm for Training Second-order Neural Networks

    The artificial neural network is a popular framework in machine learning. To empower individual neurons, we recently suggested that the current type of neurons could be upgraded to 2nd-order counterparts, in which the linear operation between inputs to a neuron and the associated weights is replaced with a nonlinear quadratic operation. A single 2nd-order neuron already has a strong nonlinear modeling ability, such as implementing basic fuzzy logic operations. In this paper, we develop a general backpropagation (BP) algorithm to train a network consisting of 2nd-order neurons. Numerical studies are performed to verify the generalized BP algorithm.
    Comment: 5 pages, 7 figures, 19 references
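    To make the idea of a quadratic operation concrete, here is a minimal sketch of one neuron's pre-activation. The parameterization (a product of two affine terms, plus a weighted sum of squared inputs, plus a constant) and every name in it are assumptions for illustration; the abstract does not give the exact formula. The usage shows XOR on binary inputs, which no single linear neuron can represent.

```python
# Hypothetical 2nd-order neuron. The functional form is an assumption:
#   z = (w_r . x + b_r) * (w_g . x + b_g) + w_b . (x * x) + c


def quadratic_neuron(x, w_r, b_r, w_g, b_g, w_b, c):
    """Return the quadratic pre-activation of one neuron for input x."""
    def dot(w):
        return sum(wi * xi for wi, xi in zip(w, x))

    product_term = (dot(w_r) + b_r) * (dot(w_g) + b_g)
    square_term = sum(wi * xi * xi for wi, xi in zip(w_b, x))
    return product_term + square_term + c


def xor_neuron(x1, x2):
    """XOR on binary inputs as (x1 - x2)^2, using only the product term."""
    return quadratic_neuron(
        [x1, x2],
        w_r=[1, -1], b_r=0,
        w_g=[1, -1], b_g=0,
        w_b=[0, 0], c=0,
    )
```

    In a trainable network the pre-activation would pass through a nonlinearity and the parameters would be learned by backpropagation; both are omitted here to keep the sketch short.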

    Deep Learning on Traffic Prediction: Methods, Analysis and Future Directions

    Traffic prediction plays an essential role in intelligent transportation systems. Accurate traffic prediction can assist route planning, guide vehicle dispatching, and mitigate traffic congestion. This problem is challenging due to the complicated and dynamic spatio-temporal dependencies between different regions in the road network. Recently, significant research effort has been devoted to this area, especially to deep learning methods, greatly advancing traffic prediction abilities. The purpose of this paper is to provide a comprehensive survey on deep learning-based approaches in traffic prediction from multiple perspectives. Specifically, we first summarize the existing traffic prediction methods and give a taxonomy. Second, we list the state-of-the-art approaches in different traffic prediction applications. Third, we comprehensively collect and organize widely used public datasets in the existing literature to facilitate other researchers. Furthermore, we give an evaluation and analysis by conducting extensive experiments to compare the performance of different methods on a real-world public dataset. Finally, we discuss open challenges in this field.
    Comment: to be published in IEEE Transactions on Intelligent Transportation Systems

    Modeling Rich Contexts for Sentiment Classification with LSTM

    Sentiment analysis on social media data such as tweets and weibo has become a very important and challenging task. Due to the intrinsic properties of such data, tweets are short, noisy, and of divergent topics, and sentiment classification on these data requires modeling various contexts such as the retweet/reply history of a tweet, and the social context about authors and relationships. While few prior studies have approached the issue of modeling contexts in tweets, this paper proposes to use a hierarchical LSTM to model rich contexts in tweets, particularly long-range context. Experimental results show that contexts can help us to perform sentiment classification remarkably better.