13,677 research outputs found
Rethinking Full Connectivity in Recurrent Neural Networks
Recurrent neural networks (RNNs) are omnipresent in sequence modeling tasks.
Practical models usually consist of several layers of hundreds or thousands of
neurons which are fully connected. This places a heavy computational and memory
burden on hardware, restricting adoption in practical low-cost and low-power
devices. Compared to fully convolutional models, the costly sequential
operation of RNNs severely hinders performance on parallel hardware. This paper
challenges the convention of full connectivity in RNNs. We study structurally
sparse RNNs, showing that they are well suited for acceleration on parallel
hardware, with a greatly reduced cost of the recurrent operations as well as
orders of magnitude fewer recurrent weights. Extensive experiments on
challenging tasks ranging from language modeling and speech recognition to
video action recognition reveal that structurally sparse RNNs achieve
competitive performance as compared to fully-connected networks. This allows
for using large sparse RNNs for a wide range of real-world tasks that
previously were too costly with fully connected networks.
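As a rough illustration of why structured sparsity cuts the recurrent cost, the sketch below compares recurrent weight counts for a dense hidden-to-hidden matrix and a block-diagonal one. The block-diagonal layout and all function names are assumptions made for illustration; the abstract does not say which sparsity structure the paper uses.

```python
from typing import List

Matrix = List[List[float]]


def dense_recurrent_params(hidden_size: int) -> int:
    """Number of weights in a fully connected hidden-to-hidden matrix."""
    return hidden_size * hidden_size


def block_diagonal_recurrent_params(hidden_size: int, num_blocks: int) -> int:
    """Number of weights when the matrix is split into equal diagonal blocks.

    Assumed structure (hypothetical): units are partitioned into num_blocks
    groups and recurrent connections exist only within a group.
    """
    if num_blocks <= 0 or hidden_size % num_blocks != 0:
        raise ValueError("hidden_size must be a positive multiple of num_blocks")
    block_size = hidden_size // num_blocks
    return num_blocks * block_size * block_size


def block_diagonal_matvec(blocks: List[Matrix], h: List[float]) -> List[float]:
    """Multiply a block-diagonal matrix, given as its diagonal blocks, by h.

    Each block touches only its own slice of h, so the blocks could be
    processed independently of one another.
    """
    out: List[float] = []
    offset = 0
    for block in blocks:
        size = len(block)
        segment = h[offset:offset + size]
        for row in block:
            out.append(sum(w * v for w, v in zip(row, segment)))
        offset += size
    if offset != len(h):
        raise ValueError("block sizes do not add up to len(h)")
    return out
```

Under these assumptions, 1024 hidden units split into 16 blocks need 65,536 recurrent weights instead of 1,048,576, a 16-fold reduction.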
A general-purpose deep learning approach to model time-varying audio effects
Audio processors whose parameters are modified periodically over time are
often referred to as time-varying or modulation-based audio effects. Most existing
methods for modeling these types of effect units are optimized to a very
specific circuit and cannot be efficiently generalized to other time-varying
effects. Based on convolutional and recurrent neural networks, we propose a
deep learning architecture for generic black-box modeling of audio processors
with long-term memory. We explore the capabilities of deep neural networks to
learn such long temporal dependencies and we show the network modeling various
linear and nonlinear, time-varying and time-invariant audio effects. In order
to measure the performance of the model, we propose an objective metric based
on the psychoacoustics of modulation frequency perception. We also analyze what
the model is actually learning and how the given task is accomplished.
Comment: audio files: https://mchijmma.github.io/modeling-time-varying
Feed-forward approximations to dynamic recurrent network architectures
Recurrent neural network architectures can have useful computational
properties, with complex temporal dynamics and input-sensitive attractor
states. However, evaluation of recurrent dynamic architectures requires
solution of systems of differential equations, and the number of evaluations
required to determine their response to a given input can vary with the input,
or can be indeterminate altogether in the case of oscillations or instability.
In feed-forward networks, by contrast, only a single pass through the network
is needed to determine the response to a given input. Modern machine-learning
systems are designed to operate efficiently on feed-forward architectures. We
hypothesised that two-layer feedforward architectures with simple,
deterministic dynamics could approximate the responses of single-layer
recurrent network architectures. By identifying the fixed-point responses of a
given recurrent network, we trained two-layer networks to directly approximate
the fixed-point response to a given input. These feed-forward networks then
embodied useful computations, including competitive interactions, information
transformations and noise rejection. Our approach was able to find useful
approximations to recurrent networks, which can then be evaluated in linear and
deterministic time complexity.
Comment: Author's final version, accepted for publication in Neural Computation
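To make the contrast with a single feed-forward pass concrete, the sketch below relaxes a small recurrent rate network to its fixed point by Euler integration and reports how many steps that took. The network equation (dx/dt = -x + tanh(Wx + u)), the two-unit mutual-inhibition weights, and the function name are assumptions chosen for illustration; the abstract does not give the model equations.

```python
import math
from typing import List, Tuple


def relax_to_fixed_point(
    W: List[List[float]],
    u: List[float],
    dt: float = 0.5,
    tol: float = 1e-9,
    max_steps: int = 10_000,
) -> Tuple[List[float], int]:
    """Euler-integrate dx/dt = -x + tanh(W x + u) from x = 0 until it settles.

    Returns the settled state and the number of steps taken. Raises if the
    dynamics do not settle (for example, because they oscillate).
    """
    n = len(u)
    x = [0.0] * n
    for step in range(1, max_steps + 1):
        drive = [
            math.tanh(sum(W[i][j] * x[j] for j in range(n)) + u[i])
            for i in range(n)
        ]
        dx = [drive[i] - x[i] for i in range(n)]
        x = [x[i] + dt * dx[i] for i in range(n)]
        if max(abs(d) for d in dx) < tol:
            return x, step
    raise RuntimeError("no fixed point reached within max_steps")


# Hypothetical two-unit network with mutual inhibition (a simple
# competitive interaction): the unit with the stronger input wins.
W_EXAMPLE = [[0.0, -0.5], [-0.5, 0.0]]
state, steps = relax_to_fixed_point(W_EXAMPLE, [1.0, 0.2])
```

The step count returned alongside the state depends on the weights and the input, which is the variable evaluation cost that a single feed-forward pass avoids.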
Variational online learning of neural dynamics
New technologies for recording the activity of large neural populations
during complex behavior provide exciting opportunities for investigating the
neural computations that underlie perception, cognition, and decision-making.
Nonlinear state space models provide an interpretable signal processing
framework by combining an intuitive dynamical system with a probabilistic
observation model, which can provide insights into neural dynamics, neural
computation, and development of neural prosthetics and treatment through
feedback control. This brings the challenge of learning both the latent neural
state and the underlying dynamical system, because neither is known a priori
for neural systems. We developed a flexible online learning framework for latent
nonlinear state dynamics and filtered latent states. Using the stochastic
gradient variational Bayes approach, our method jointly optimizes the
parameters of the nonlinear dynamical system, the observation model, and the
black-box recognition model. Unlike previous approaches, our framework can
incorporate non-trivial distributions of observation noise and has constant
time and space complexity. These features make our approach amenable to
real-time applications and give it the potential to automate analysis and
experimental design in ways that testably track and modify behavior using
stimuli designed to influence learning.
Comment: accepted by Frontiers in Computational Neuroscience
Multistep Speed Prediction on Traffic Networks: A Graph Convolutional Sequence-to-Sequence Learning Approach with Attention Mechanism
Multistep traffic forecasting on road networks is a crucial task in
successful intelligent transportation system applications. To capture the
complex non-stationary temporal dynamics and spatial dependency in multistep
traffic-condition prediction, we propose a novel deep learning framework named
attention graph convolutional sequence-to-sequence model (AGC-Seq2Seq). In the
proposed deep learning framework, temporal and spatial dependencies are modeled
through the Seq2Seq model and the graph convolution network, respectively, and the
attention mechanism along with a newly designed training method based on the
Seq2Seq architecture is proposed to overcome the difficulty in multistep
prediction and further capture the temporal heterogeneity of traffic patterns.
We conduct numerical tests to compare AGC-Seq2Seq with other benchmark models
using a real-world dataset. The results indicate that our model yields the best
prediction performance in terms of various prediction error measures.
Furthermore, the variation of spatiotemporal correlation of traffic conditions
under different prediction steps and road segments is revealed through
sensitivity analyses.
NTIRE 2020 Challenge on Image and Video Deblurring
Motion blur is one of the most common degradation artifacts in dynamic scene
photography. This paper reviews the NTIRE 2020 Challenge on Image and Video
Deblurring. In this challenge, we present the evaluation results from 3
competition tracks as well as the proposed solutions. Track 1 aims to develop
single-image deblurring methods focusing on restoration quality. On Track 2,
the image deblurring methods are executed on a mobile platform to find the
balance of the running speed and the restoration accuracy. Track 3 targets
developing video deblurring methods that exploit the temporal relation between
input frames. The three tracks had 163, 135, and 102 registered participants,
respectively, and 9, 4, and 7 teams competed in the final testing phase. The
winning methods demonstrate state-of-the-art performance on image and video
deblurring tasks.
Comment: To be published in CVPR 2020 Workshop (New Trends in Image Restoration and Enhancement)
Context-Aware Deep Spatio-Temporal Network for Hand Pose Estimation from Depth Images
As a fundamental and challenging problem in computer vision, hand pose
estimation aims to estimate the hand joint locations from depth images.
Typically, the problem is modeled as learning a mapping function from images to
hand joint coordinates in a data-driven manner. In this paper, we propose
Context-Aware Deep Spatio-Temporal Network (CADSTN), a novel method to jointly
model the spatio-temporal properties for hand pose estimation. Our proposed
network is able to learn the representations of the spatial information and the
temporal structure from the image sequences. Moreover, by adopting an adaptive
fusion method, the model is capable of dynamically weighting different
predictions to lay emphasis on sufficient context. Our method is examined on
two common benchmarks; the experimental results demonstrate that our proposed
approach achieves the best or the second-best performance compared with
state-of-the-art methods and runs at 60 fps.
Comment: IEEE Transactions on Cybernetics
General Backpropagation Algorithm for Training Second-order Neural Networks
The artificial neural network is a popular framework in machine learning. To
empower individual neurons, we recently suggested that the current type of
neurons could be upgraded to 2nd-order counterparts, in which the linear
operation between inputs to a neuron and the associated weights is replaced
with a nonlinear quadratic operation. A single 2nd-order neuron already has a
strong nonlinear modeling ability, such as implementing basic fuzzy logic
operations. In this paper, we develop a general backpropagation (BP) algorithm
to train networks consisting of 2nd-order neurons. Numerical studies are
performed to verify the generalized BP algorithm.
Comment: 5 pages, 7 figures, 19 references
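The sketch below illustrates what replacing the linear operation with a quadratic one can buy. It assumes one possible quadratic form, a product of two affine terms plus a weighted sum of squared inputs; the abstract does not state the exact formula, so this form, the parameter names, and the hand-picked weights are illustrative assumptions. With these weights a single neuron computes XOR on binary inputs, which no single linear threshold neuron can represent.

```python
from typing import Sequence


def dot(w: Sequence[float], x: Sequence[float]) -> float:
    """Plain inner product of two equal-length sequences."""
    return sum(wi * xi for wi, xi in zip(w, x))


def quadratic_pre_activation(
    x: Sequence[float],
    w_r: Sequence[float],
    b_r: float,
    w_g: Sequence[float],
    b_g: float,
    w_b: Sequence[float],
    c: float,
) -> float:
    """Assumed 2nd-order form: (w_r.x + b_r) * (w_g.x + b_g) + w_b.(x*x) + c."""
    squared = [xi * xi for xi in x]
    return (dot(w_r, x) + b_r) * (dot(w_g, x) + b_g) + dot(w_b, squared) + c


def xor_neuron(x1: int, x2: int) -> int:
    """Single quadratic neuron with hand-picked weights, thresholded at 0.5.

    The pre-activation reduces to (x1 - x2) ** 2, which is 1 exactly when
    the two binary inputs differ.
    """
    pre = quadratic_pre_activation(
        [x1, x2],
        w_r=[1.0, -1.0], b_r=0.0,
        w_g=[1.0, -1.0], b_g=0.0,
        w_b=[0.0, 0.0], c=0.0,
    )
    return 1 if pre > 0.5 else 0
```

Here the weights are set by hand to show representational capacity only; training such neurons is what the backpropagation algorithm described in the abstract addresses.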
Deep Learning on Traffic Prediction: Methods, Analysis and Future Directions
Traffic prediction plays an essential role in intelligent transportation
systems. Accurate traffic prediction can assist route planning, guide vehicle
dispatching, and mitigate traffic congestion. This problem is challenging due
to the complicated and dynamic spatio-temporal dependencies between different
regions in the road network. Recently, significant research effort has been
devoted to this area, especially to deep learning methods, greatly
advancing traffic prediction abilities. The purpose of this paper is to provide
a comprehensive survey on deep learning-based approaches in traffic prediction
from multiple perspectives. Specifically, we first summarize the existing
traffic prediction methods, and give a taxonomy. Second, we list the
state-of-the-art approaches in different traffic prediction applications.
Third, we comprehensively collect and organize widely used public datasets in
the existing literature to facilitate other researchers. Furthermore, we give
an evaluation and analysis by conducting extensive experiments to compare the
performance of different methods on a real-world public dataset. Finally, we
discuss open challenges in this field.
Comment: to be published in IEEE Transactions on Intelligent Transportation Systems
Modeling Rich Contexts for Sentiment Classification with LSTM
Sentiment analysis on social media data such as tweets and weibo has become a
very important and challenging task. Because such data are intrinsically
short, noisy, and topically divergent, sentiment classification on them
requires modeling various contexts such as the
retweet/reply history of a tweet, and the social context about authors and
relationships. While few prior studies have approached the issue of modeling
contexts in tweets, this paper proposes to use a hierarchical LSTM to model rich
contexts in tweets, particularly long-range context. Experimental results show
that contexts can help us to perform sentiment classification remarkably
better.