Complex Unitary Recurrent Neural Networks using Scaled Cayley Transform
Recurrent neural networks (RNNs) have been successfully used on a wide range
of sequential data problems. A well-known difficulty in using RNNs is the
vanishing or exploding gradient problem. Recently, there have been
several different RNN architectures that try to mitigate this issue by
maintaining an orthogonal or unitary recurrent weight matrix. One such
architecture is the scaled Cayley orthogonal recurrent neural network (scoRNN),
which parametrizes the orthogonal recurrent weight matrix through a scaled
Cayley transform. This parametrization contains a diagonal scaling matrix whose
entries are +1 or -1 and therefore cannot be optimized by gradient descent. The
scaling matrix is thus fixed before training, and a hyperparameter is introduced
to tune it for each particular task. In
this paper, we develop a unitary RNN architecture based on a complex scaled
Cayley transform. Unlike the real orthogonal case, the transformation uses a
diagonal scaling matrix consisting of entries on the complex unit circle which
can be optimized using gradient descent and no longer requires the tuning of a
hyperparameter. We also provide an analysis of a potential issue with the
modReLU activation function, which is used in our work and in several other
unitary RNNs. In the experiments conducted, the scaled Cayley unitary recurrent
neural network (scuRNN) achieves comparable or better results than scoRNN and
other unitary RNNs without fixing the scaling matrix.
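A minimal numerical sketch of the scaled Cayley construction described above, assuming the usual form W = (I + A)^{-1}(I - A)D with A skew-Hermitian and D diagonal with unit-modulus entries; the function and parameter names are illustrative, not the paper's code:

    import numpy as np

    def scaled_cayley_unitary(B, theta):
        """Build W = (I + A)^{-1} (I - A) D from an unconstrained complex
        matrix B and angles theta. A = B - B^H is skew-Hermitian and
        D = diag(exp(i*theta)) lies on the complex unit circle, so W is
        unitary. (Illustrative sketch only.)"""
        n = B.shape[0]
        A = B - B.conj().T
        D = np.diag(np.exp(1j * theta))
        I = np.eye(n, dtype=complex)
        return np.linalg.solve(I + A, (I - A) @ D)

    # sanity check: W should be numerically unitary
    rng = np.random.default_rng(0)
    n = 4
    B = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
    W = scaled_cayley_unitary(B, rng.uniform(0, 2 * np.pi, size=n))
    print(np.allclose(W.conj().T @ W, np.eye(n)))  # True

Because the angles theta enter smoothly through exp(i*theta), the scaling matrix can be updated by gradient descent, which is the practical difference from the fixed +/-1 scaling used in scoRNN.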
Non-normal Recurrent Neural Network (nnRNN): learning long time dependencies while improving expressivity with transient dynamics
A recent strategy to circumvent the exploding and vanishing gradient problem
in RNNs, and to allow the stable propagation of signals over long time scales,
is to constrain recurrent connectivity matrices to be orthogonal or unitary.
This ensures eigenvalues with unit norm and thus stable dynamics and training.
However, this comes at the cost of reduced expressivity due to the limited
variety of orthogonal transformations. We propose a novel connectivity
structure based on the Schur decomposition and a splitting of the Schur form
into normal and non-normal parts. This allows us to parametrize matrices with
unit-norm eigenspectra without orthogonality constraints on eigenbases. The
resulting architecture ensures access to a larger space of spectrally
constrained matrices, of which orthogonal matrices are a subset. This crucial
difference retains the stability advantages and training speed of orthogonal
RNNs while enhancing expressivity, especially on tasks that require
computations over ongoing input sequences.
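A rough sketch of the kind of Schur-form parametrization described above, assuming a split W = P (R + U) P^T with P orthogonal, R block-diagonal 2x2 rotations (unit-modulus eigenvalues), and U an upper-triangular part that is zero inside the rotation blocks and carries the non-normal transient dynamics; the names and the exact split are assumptions, not the paper's code:

    import numpy as np
    from scipy.linalg import qr

    def schur_form_recurrent_matrix(thetas, upper_seed, basis_seed):
        # Normal part R: 2x2 rotation blocks with eigenvalues exp(+/- i*theta).
        n = 2 * len(thetas)
        R = np.zeros((n, n))
        for k, t in enumerate(thetas):
            c, s = np.cos(t), np.sin(t)
            R[2*k:2*k+2, 2*k:2*k+2] = [[c, -s], [s, c]]
        # Non-normal part U: strictly upper-triangular, zeroed inside each
        # block so the spectrum of R + U stays on the unit circle.
        U = np.triu(upper_seed, k=1)
        for k in range(len(thetas)):
            U[2*k, 2*k + 1] = 0.0
        # Orthogonal eigenbasis P from a QR factorization of an unconstrained seed.
        P, _ = qr(basis_seed)
        return P @ (R + U) @ P.T

    W = schur_form_recurrent_matrix(np.array([0.3, 1.2]),
                                    np.random.randn(4, 4),
                                    np.random.randn(4, 4))
    print(np.abs(np.linalg.eigvals(W)))  # all (numerically) equal to 1

Unlike a strictly orthogonal recurrent matrix, the U part allows transient, non-normal amplification while long-run stability is still governed by the unit-norm eigenvalues.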
CayleyNets: Graph Convolutional Neural Networks with Complex Rational Spectral Filters
The rise of graph-structured data such as social networks, regulatory
networks, citation graphs, and functional brain networks, combined with the
resounding success of deep learning in various applications, has brought growing
interest in generalizing deep learning models to non-Euclidean domains. In this
paper, we introduce a new spectral domain convolutional architecture for deep
learning on graphs. The core ingredient of our model is a new class of
parametric rational complex functions (Cayley polynomials) that allow spectral
filters on graphs to be computed efficiently and to specialize in frequency
bands of interest. Our model generates rich spectral filters that are localized
in space, scales linearly with the size of the input data for
sparsely-connected graphs, and can handle different constructions of Laplacian
operators. Extensive experimental results show the superior performance of our
approach, in comparison to other spectral domain convolutional architectures,
on spectral image classification, community detection, vertex classification
and matrix completion tasks.
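As a hedged, dense-matrix illustration of the Cayley filters described above, assuming the rational form g(L) = c_0 I + 2 Re(sum_j c_j C^j) with C = (hL - iI)(hL + iI)^{-1}; the actual model works with sparse Laplacians and iterative solvers rather than a dense inverse, and the names below are illustrative:

    import numpy as np

    def cayley_filter(L, c, h):
        # Cayley transform of the spectrally zoomed Laplacian:
        # C = (hL - iI)(hL + iI)^{-1}
        n = L.shape[0]
        I = np.eye(n)
        C = np.linalg.solve(h * L + 1j * I, h * L - 1j * I)
        # g(L) = c[0]*I + 2*Re(sum_j c[j] * C^j), here with real coefficients c
        G = c[0] * I.astype(complex)
        Ck = np.eye(n, dtype=complex)
        for ck in c[1:]:
            Ck = Ck @ C
            G = G + 2.0 * ck * Ck
        return G.real

    # apply a small filter to a signal on a 4-node path-graph Laplacian
    L = np.diag([1., 2., 2., 1.]) - np.diag([1., 1., 1.], 1) - np.diag([1., 1., 1.], -1)
    x = np.random.randn(4)
    y = cayley_filter(L, c=[0.5, 0.3, 0.1], h=1.0) @ x

The spectral zoom h rescales the Laplacian's spectrum so that the learned coefficients can concentrate the filter on a frequency band of interest.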
Learning Unitary Operators with Help From u(n)
A major challenge in the training of recurrent neural networks is the
so-called vanishing or exploding gradient problem. The use of a norm-preserving
transition operator can address this issue, but parametrization is challenging.
In this work we focus on unitary operators and describe a parametrization using
the Lie algebra associated with the Lie group of unitary matrices. The exponential map provides a correspondence
between these spaces, and allows us to define a unitary matrix using real
coefficients relative to a basis of the Lie algebra. The parametrization is
closed under additive updates of these coefficients, and thus provides a simple
space in which to do gradient descent. We demonstrate the effectiveness of this
parametrization on the problem of learning arbitrary unitary operators,
comparing to several baselines and outperforming a recently-proposed
lower-dimensional parametrization. We additionally use our parametrization to
generalize a recently-proposed unitary recurrent neural network to arbitrary
unitary matrices, using it to solve standard long-memory tasks.
Comment: 9 pages, 3 figures, 5 figures inc. subfigures, to appear at AAAI-1
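A minimal sketch of the exponential-map idea described above, assuming the unitary matrix is obtained as expm(A) for a skew-Hermitian A built from n*n real coefficients; the particular split of the coefficients into symmetric and skew-symmetric parts is an assumption, not the paper's exact Lie-algebra basis:

    import numpy as np
    from scipy.linalg import expm

    def unitary_from_real_coefficients(params):
        # Map n*n real parameters to an element A of the Lie algebra u(n):
        # A = S + i*H with S real skew-symmetric and H real symmetric, so A^H = -A.
        S = (params - params.T) / 2.0
        H = (params + params.T) / 2.0
        A = S + 1j * H
        # The exponential map sends u(n) into the unitary group U(n).
        return expm(A)

    U = unitary_from_real_coefficients(np.random.randn(3, 3))
    print(np.allclose(U.conj().T @ U, np.eye(3)))  # True

Because additive updates to the real coefficients stay inside the algebra, plain gradient descent on these parameters always yields a unitary matrix after the exponential map.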