Short-term Memory of Deep RNN
The extension of deep learning towards temporal data processing is gaining an
increasing research interest. In this paper we investigate the properties of
state dynamics developed in successive levels of deep recurrent neural networks
(RNNs) in terms of short-term memory abilities. Our results reveal interesting
insights that shed light on the nature of layering as a factor of RNN design.
Noticeably, higher layers in a hierarchically organized RNN architecture
prove to be inherently biased towards longer memory spans, even prior to
training of the recurrent connections. Moreover, in the context of the
Reservoir Computing framework, our analysis also points out the benefit of a layered
recurrent organization as an efficient approach to improve the memory skills of
reservoir models.
Comment: This is a pre-print (pre-review) version of the paper accepted for
presentation at the 26th European Symposium on Artificial Neural Networks,
Computational Intelligence and Machine Learning (ESANN), Bruges (Belgium),
25-27 April 201
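The layering effect described in the abstract can be probed empirically. Below is a minimal sketch, not the paper's actual experimental setup: an untrained two-layer reservoir is driven by i.i.d. input, and the classic short-term memory capacity (the sum over lags of squared correlations between the delayed input and a trained linear readout) is computed per layer. All sizes, scalings, and helper names (`run_layers`, `memory_capacity`) are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def random_reservoir(n, rho):
    """Untrained recurrent matrix rescaled to spectral radius rho."""
    W = rng.uniform(-1, 1, (n, n))
    return W * (rho / max(abs(np.linalg.eigvals(W))))

def run_layers(u, n=50, layers=2, rho=0.9, scale_in=0.5):
    """Drive a stack of untrained reservoirs: the external input feeds layer 0,
    each higher layer is driven by the states of the layer below."""
    drive, all_states = u.reshape(-1, 1), []
    for _ in range(layers):
        W = random_reservoir(n, rho)
        Win = rng.uniform(-scale_in, scale_in, (n, drive.shape[1]))
        X, x = np.zeros((len(u), n)), np.zeros(n)
        for t in range(len(u)):
            x = np.tanh(W @ x + Win @ drive[t])
            X[t] = x
        all_states.append(X)
        drive = X
    return all_states

def memory_capacity(X, u, max_lag=40, washout=100):
    """Sum over lags k of the squared correlation between u(t-k) and a
    ridge-regression readout trained to reconstruct it from the states."""
    mc = 0.0
    for k in range(1, max_lag + 1):
        Xk, yk = X[washout:], u[washout - k:len(u) - k]
        w = np.linalg.solve(Xk.T @ Xk + 1e-6 * np.eye(Xk.shape[1]), Xk.T @ yk)
        mc += np.corrcoef(Xk @ w, yk)[0, 1] ** 2
    return mc

u = rng.uniform(-0.8, 0.8, 2000)
S1, S2 = run_layers(u)
print(memory_capacity(S1, u), memory_capacity(S2, u))  # per-layer memory capacity
```

Comparing the two printed values for several random seeds gives a quick empirical check of the layering bias the paper analyzes.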
Sparsity in Reservoir Computing Neural Networks
Reservoir Computing (RC) is a well-known strategy for designing Recurrent
Neural Networks characterized by a strikingly efficient training procedure. The crucial aspect
of RC is to properly instantiate the hidden recurrent layer that serves as
dynamical memory to the system. In this respect, the common recipe is to create
a pool of randomly and sparsely connected recurrent neurons. While the aspect
of sparsity in the design of RC systems has been debated in the literature, it
is nowadays understood mainly as a way to enhance the efficiency of
computation, exploiting sparse matrix operations. In this paper, we empirically
investigate the role of sparsity in RC network design under the perspective of
the richness of the developed temporal representations. We analyze both
sparsity in the recurrent connections, and in the connections from the input to
the reservoir. Our results point out that sparsity, in particular in
input-reservoir connections, has a major role in developing internal temporal
representations that have a longer short-term memory of past inputs and a
higher dimension.
Comment: This paper is currently under review
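A toy version of this comparison can be set up in a few lines. The sketch below is an illustrative assumption, not the paper's protocol: it builds reservoirs with controllable sparsity in the input and recurrent connections, and uses the number of principal components explaining 90% of the state variance as a rough proxy for the dimensionality of the temporal representation.

```python
import numpy as np

rng = np.random.default_rng(1)

def sparse_matrix(rows, cols, density, scale=1.0):
    """Uniform random matrix with roughly `density` fraction of nonzero entries."""
    M = rng.uniform(-scale, scale, (rows, cols))
    M[rng.random((rows, cols)) > density] = 0.0
    return M

def run_esn(u, n=100, in_density=1.0, rec_density=1.0, rho=0.9, washout=100):
    """Untrained reservoir with controllable sparsity of the input-reservoir
    and recurrent connections."""
    W = sparse_matrix(n, n, rec_density)
    W *= rho / max(abs(np.linalg.eigvals(W)))
    Win = sparse_matrix(n, 1, in_density)
    X, x = np.zeros((len(u), n)), np.zeros(n)
    for t, ut in enumerate(u):
        x = np.tanh(W @ x + Win[:, 0] * ut)
        X[t] = x
    return X[washout:]

def effective_dimension(X, var=0.9):
    """Number of principal components explaining `var` of the state variance --
    one rough proxy for the richness of the temporal representation."""
    s = np.linalg.svd(X - X.mean(axis=0), compute_uv=False) ** 2
    return int(np.searchsorted(np.cumsum(s) / s.sum(), var) + 1)

u = rng.uniform(-0.8, 0.8, 1000)
dense = effective_dimension(run_esn(u, in_density=1.0))
sparse = effective_dimension(run_esn(u, in_density=0.1))
print(dense, sparse)  # compare representation dimensionality
```

Varying `in_density` and `rec_density` separately reproduces, in miniature, the two axes of sparsity the abstract distinguishes.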
Tree Edit Distance Learning via Adaptive Symbol Embeddings
Metric learning aims to improve classification accuracy by learning a
distance measure which brings data points from the same class closer together
and pushes data points from different classes further apart. Recent research
has demonstrated that metric learning approaches can also be applied to trees,
such as molecular structures, abstract syntax trees of computer programs, or
syntax trees of natural language, by learning the cost function of an edit
distance, i.e. the costs of replacing, deleting, or inserting nodes in a tree.
However, learning such costs directly may yield an edit distance which violates
metric axioms, is challenging to interpret, and may not generalize well. In
this contribution, we propose a novel metric learning approach for trees which
we call embedding edit distance learning (BEDL) and which learns an edit
distance indirectly by embedding the tree nodes as vectors, such that the
Euclidean distance between those vectors supports class discrimination. We
learn such embeddings by reducing the distance to prototypical trees from the
same class and increasing the distance to prototypical trees from different
classes. In our experiments, we show that BEDL improves upon the
state-of-the-art in metric learning for trees on six benchmark data sets,
ranging from computer science over biomedical data to a natural-language
processing data set containing over 300,000 nodes.
Comment: Paper at the International Conference on Machine Learning (2018),
2018-07-10 to 2018-07-15 in Stockholm, Sweden
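The core idea of deriving edit costs from embeddings can be sketched compactly. BEDL learns the embeddings with a prototype-based objective; the code below only illustrates the forward computation and, for brevity, simplifies trees to label sequences so a plain Levenshtein-style dynamic program suffices. As in the abstract, replacing x with y costs the Euclidean distance between their embedding vectors, and deleting or inserting x costs the norm of its embedding. The example embeddings are hypothetical.

```python
import numpy as np

def edit_distance(a, b, emb):
    """Levenshtein-style edit distance whose costs come from label embeddings:
    replace(x, y) = ||emb[x] - emb[y]||, delete/insert(x) = ||emb[x]||."""
    def cost(x, y=None):
        if y is None:
            return np.linalg.norm(emb[x])
        return np.linalg.norm(emb[x] - emb[y])

    n, m = len(a), len(b)
    D = np.zeros((n + 1, m + 1))
    for i in range(1, n + 1):
        D[i, 0] = D[i - 1, 0] + cost(a[i - 1])
    for j in range(1, m + 1):
        D[0, j] = D[0, j - 1] + cost(b[j - 1])
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            D[i, j] = min(D[i - 1, j - 1] + cost(a[i - 1], b[j - 1]),  # replace
                          D[i - 1, j] + cost(a[i - 1]),                # delete
                          D[i, j - 1] + cost(b[j - 1]))                # insert
    return D[n, m]

emb = {"a": np.array([1.0, 0.0]),
       "b": np.array([0.0, 1.0]),
       "c": np.array([1.0, 0.0])}   # 'a' and 'c' share an embedding

# Since 'a' and 'c' are embedded identically, swapping them is free:
print(edit_distance("abc", "cba", emb))  # 0.0
```

This shows why learning embeddings rather than raw cost tables is attractive: symbols that the classes treat as interchangeable can be mapped to nearby vectors, and the induced distance automatically respects the metric axioms.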
Tree Echo State Networks
In this paper we present the Tree Echo State Network (TreeESN) model, generalizing the paradigm of Reservoir Computing to tree structured data. TreeESNs exploit an untrained generalized recursive reservoir, exhibiting extreme efficiency for learning in structured domains. In addition, we highlight through the paper other characteristics of the approach. First, we discuss the Markovian characterization of reservoir dynamics, extended to the case of tree domains, that is implied by the contractive setting of the TreeESN state transition function. Second, we study two types of state mapping functions to map the tree structured state of a TreeESN into a fixed-size feature representation for classification or regression tasks. The critical role of the relation between the choice of the state mapping function and the Markovian characterization of the task is analyzed and experimentally investigated on both artificial and real-world tasks. Finally, experimental results on benchmark and real-world tasks show that the TreeESN approach, in spite of its efficiency, can achieve results comparable to those of state-of-the-art, though more complex, neural and kernel-based models for tree structured data.
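The recursive encoding and the two state mappings can be sketched as follows. This is a minimal illustration, not the paper's exact formulation: contractivity is enforced here through one simple sufficient condition (scaling the recurrent weights so that K·||W||₂ < 1, with K the maximum branching factor), and all sizes are arbitrary assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)

N, D, K = 30, 3, 2          # reservoir units, label size, max branching factor
Win = rng.uniform(-0.5, 0.5, (N, D))
W = rng.uniform(-1, 1, (N, N))
# Sufficient condition for contractive (hence Markovian) dynamics: K * ||W||_2 < 1.
W *= 0.4 / (K * np.linalg.norm(W, 2))

def encode(tree, states):
    """tree = (label_vector, [child subtrees]); computes reservoir states bottom-up,
    each node's state depending on its label and its children's states."""
    label, children = tree
    x = np.tanh(Win @ label + sum((W @ encode(c, states) for c in children),
                                  np.zeros(N)))
    states.append(x)
    return x

def tree_features(tree, mapping="root"):
    """Fixed-size feature vector via a state mapping: root state or mean state."""
    states = []
    root = encode(tree, states)
    return root if mapping == "root" else np.mean(states, axis=0)

leaf = lambda v: (np.array(v, float), [])
tree = (np.array([1.0, 0.0, 0.0]), [leaf([0.0, 1.0, 0.0]), leaf([0.0, 0.0, 1.0])])
phi = tree_features(tree, mapping="mean")   # feed this to a trained linear readout
print(phi.shape)  # (30,)
```

Only the readout applied to `phi` would be trained; the recursive reservoir itself stays fixed, which is where the model's efficiency comes from.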
Fast and Deep Graph Neural Networks
We address the efficiency issue for the construction of a deep graph neural
network (GNN). The approach exploits the idea of representing each input graph
as a fixed point of a dynamical system (implemented through a recurrent neural
network), and leverages a deep architectural organization of the recurrent
units. Efficiency is gained by many aspects, including the use of small and
very sparse networks, where the weights of the recurrent units are left
untrained under the stability condition introduced in this work. This can be
viewed as a way to study the intrinsic power of the architecture of a deep GNN,
and also to provide insights for the set-up of more complex fully-trained
models. Through experimental results, we show that even without training of the
recurrent connections, the architecture of a small deep GNN is surprisingly able
to achieve or improve the state-of-the-art performance on a significant set of
tasks in the field of graph classification.
Comment: Pre-print of 'Fast and Deep Graph Neural Networks', accepted for AAAI
2020. This document includes the Supplementary Material
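The fixed-point construction can be sketched for a single untrained layer. This is an illustrative simplification, not the stability condition introduced in the paper: here stability is guaranteed by one standard sufficient condition, ||W||₂·deg_max < 1, which makes the node-state update a contraction; sizes and scalings are assumptions.

```python
import numpy as np

rng = np.random.default_rng(3)

def graph_embedding(A, U, n=20, iters=200, tol=1e-6):
    """Node states as the fixed point of x_v = tanh(Win u_v + W * sum_{u~v} x_u),
    with untrained weights. Stability is enforced via the sufficient condition
    ||W||_2 * deg_max < 1, which makes the iterated map a contraction."""
    V, d = U.shape
    deg_max = max(A.sum(axis=1).max(), 1.0)
    Win = rng.uniform(-0.5, 0.5, (n, d))
    W = rng.uniform(-1, 1, (n, n))
    W *= 0.9 / (np.linalg.norm(W, 2) * deg_max)
    X = np.zeros((V, n))
    for _ in range(iters):
        X_new = np.tanh(U @ Win.T + A @ X @ W.T)   # one synchronous update of all nodes
        if np.abs(X_new - X).max() < tol:
            X = X_new
            break
        X = X_new
    return X.sum(axis=0)        # sum pooling: one vector per graph for the readout

# Toy 4-node cycle graph with 2-dimensional node features:
A = np.array([[0, 1, 0, 1], [1, 0, 1, 0], [0, 1, 0, 1], [1, 0, 1, 0]], float)
U = rng.uniform(-1, 1, (4, 2))
g = graph_embedding(A, U)
print(g.shape)  # (20,)
```

A deep variant would stack such layers, feeding each layer's converged node states (possibly together with the input features) to the next; only the final readout over the pooled vectors is trained.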
Edge of stability echo state networks
Echo State Networks (ESNs) are time-series processing models working under
the Echo State Property (ESP) principle. The ESP is a notion of stability that
imposes an asymptotic fading of the memory of the input. On the other hand, the
resulting inherent architectural bias of ESNs may lead to an excessive loss of
information, which in turn harms the performance in certain tasks with long
short-term memory requirements. With the goal of bringing together the fading
memory property and the ability to retain as much memory as possible, in this
paper we introduce a new ESN architecture, called the Edge of Stability Echo
State Network (ES²N). The introduced ES²N model is based on defining the
reservoir layer as a convex combination of a nonlinear reservoir (as in the
standard ESN), and a linear reservoir that implements an orthogonal
transformation. We provide a thorough mathematical analysis of the introduced
model, proving that the whole eigenspectrum of the Jacobian of the ES²N map
can be contained in an annular neighbourhood of a complex circle of
controllable radius, and exploit this property to demonstrate that the
ES²N's forward dynamics evolve close to the edge-of-chaos regime by design.
Remarkably, our experimental analysis shows that the newly introduced reservoir
model is able to reach the theoretical maximum short-term memory capacity. At
the same time, in comparison to the standard ESN, ES²N is shown to offer an
excellent trade-off between memory and nonlinearity, as well as a significant
improvement of performance in autoregressive nonlinear modeling.
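The convex-combination update described in the abstract can be written down directly. The sketch below is a rough illustration under assumed hyperparameters (β = 0.1, spectral radius 0.9), not the paper's exact parameterization; it also checks numerically that the eigenvalues of the Jacobian at the origin cluster in an annulus around the circle of radius 1−β, consistent with the stated result.

```python
import numpy as np

rng = np.random.default_rng(4)

n, beta = 100, 0.1
O, _ = np.linalg.qr(rng.standard_normal((n, n)))     # random orthogonal matrix
W = rng.uniform(-1, 1, (n, n))
W *= 0.9 / max(abs(np.linalg.eigvals(W)))            # nonlinear branch, spectral radius 0.9
Win = rng.uniform(-1, 1, n)

def es2n_run(u):
    """ES^2N state update: a convex combination (weight beta) of a standard tanh
    reservoir branch and a linear orthogonal branch that preserves the state norm."""
    x, X = np.zeros(n), np.zeros((len(u), n))
    for t, ut in enumerate(u):
        x = (1 - beta) * (O @ x) + beta * np.tanh(W @ x + Win * ut)
        X[t] = x
    return X

X = es2n_run(rng.uniform(-1, 1, 500))

# Jacobian at the origin (tanh'(0) = 1): its eigenvalues lie in an annulus
# around the circle of radius 1 - beta.
J0 = (1 - beta) * O + beta * W
mags = np.abs(np.linalg.eigvals(J0))
print(round(mags.min(), 3), round(mags.max(), 3))
```

Shrinking β moves the eigenvalue annulus closer to the unit circle, which is the "controllable radius" mechanism behind the edge-of-stability behaviour.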
Hierarchical Temporal Representation in Linear Reservoir Computing
Recently, studies on deep Reservoir Computing (RC) highlighted the role of
layering in deep recurrent neural networks (RNNs). In this paper, the use of
linear recurrent units allows us to provide further evidence for the intrinsic
hierarchical temporal representation in deep RNNs, through frequency analysis
applied to the state signals. The potential of our approach is assessed on
the class of Multiple Superimposed Oscillator tasks. Furthermore, our
investigation provides useful insights to open a discussion on the main aspects
that characterize the deep learning framework in the temporal domain.
Comment: This is a pre-print of the paper submitted to the 27th Italian
Workshop on Neural Networks, WIRN 201
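The frequency analysis lends itself to a compact sketch. Below, a stack of untrained *linear* reservoirs is driven by a Multiple-Superimposed-Oscillator-style input (a sum of incommensurate sines), and each layer's state spectrum is summarized by its spectral centroid; the specific sizes, frequencies, and the centroid summary are illustrative assumptions rather than the paper's procedure.

```python
import numpy as np

rng = np.random.default_rng(5)

def linear_deep_states(u, layers=3, n=50, rho=0.9):
    """Stack of untrained *linear* reservoirs: layer 0 is driven by the input
    signal, each higher layer by the states of the layer below."""
    drive, out = u.reshape(-1, 1), []
    for _ in range(layers):
        W = rng.uniform(-1, 1, (n, n))
        W *= rho / max(abs(np.linalg.eigvals(W)))
        Win = rng.uniform(-1, 1, (n, drive.shape[1]))
        X, x = np.zeros((len(u), n)), np.zeros(n)
        for t in range(len(u)):
            x = W @ x + Win @ drive[t]          # no nonlinearity
            X[t] = x
        out.append(X)
        drive = X
    return out

def spectral_centroid(X):
    """Mean normalized frequency of the states' power spectra (range 0 to 0.5)."""
    P = np.abs(np.fft.rfft(X - X.mean(axis=0), axis=0)) ** 2
    f = np.fft.rfftfreq(X.shape[0])
    return float((f[:, None] * P).sum() / P.sum())

# A Multiple Superimposed Oscillator-style input: a sum of incommensurate sines.
steps = np.arange(2000)
u = np.sin(0.2 * steps) + np.sin(0.311 * steps) + np.sin(0.42 * steps)
states = linear_deep_states(u)
for layer, X in enumerate(states):
    print(layer, spectral_centroid(X[200:]))    # per-layer spectral summary
```

Because each stable linear layer acts as a low-pass filter on what it receives, comparing the per-layer centroids makes the hierarchical slow-down of the temporal representation directly visible.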
Deep Echo State Networks for Diagnosis of Parkinson's Disease
In this paper, we introduce a novel approach for diagnosis of Parkinson's
Disease (PD) based on deep Echo State Networks (ESNs). The identification of PD
is performed by analyzing the whole time-series collected from a tablet device
during the sketching of spiral tests, without the need for feature extraction
and data preprocessing. We evaluated the proposed approach on a public dataset
of spiral tests. The results of the experimental analysis show that DeepESNs
perform significantly better than a shallow ESN model. Overall, the proposed
approach obtains state-of-the-art results in the identification of PD on this
kind of temporal data.
Comment: This is a pre-print of the paper submitted to the European Symposium
on Artificial Neural Networks, Computational Intelligence and Machine
Learning, ESANN 201
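The end-to-end pipeline (raw time-series in, class label out, no feature engineering) can be sketched on synthetic data. Everything below is a toy stand-in for the spiral recordings, not the paper's setup: two classes of noisy oscillations are encoded by a fixed two-layer reservoir, each sequence is summarized by its concatenated per-layer mean states, and a ridge readout classifies.

```python
import numpy as np

rng = np.random.default_rng(6)

# Fixed, untrained deep reservoir weights (2 layers), shared across sequences:
n_units, n_layers = 40, 2
WEIGHTS, dim = [], 1
for _ in range(n_layers):
    W = rng.uniform(-1, 1, (n_units, n_units))
    W *= 0.9 / max(abs(np.linalg.eigvals(W)))          # echo state rescaling
    WEIGHTS.append((W, rng.uniform(-1, 1, (n_units, dim))))
    dim = n_units

def deep_esn_features(u):
    """Run a whole time-series through the layered reservoir and return the
    per-layer mean states, concatenated into one fixed-size feature vector."""
    drive, feats = u.reshape(-1, 1), []
    for W, Win in WEIGHTS:
        X, x = np.zeros((len(u), n_units)), np.zeros(n_units)
        for t in range(len(u)):
            x = np.tanh(W @ x + Win @ drive[t])
            X[t] = x
        feats.append(X.mean(axis=0))
        drive = X
    return np.concatenate(feats)

# Toy stand-in for the tablet recordings: two classes with different dynamics.
def make_seq(cls, T=200):
    f = 0.05 if cls == 0 else 0.25
    return np.sin(2 * np.pi * f * np.arange(T)) + 0.1 * rng.standard_normal(T)

labels = [0, 1] * 20
X = np.array([deep_esn_features(make_seq(c)) for c in labels])
y = np.array([1.0 if c == 0 else -1.0 for c in labels])
w = np.linalg.solve(X.T @ X + 1e-4 * np.eye(X.shape[1]), X.T @ y)  # ridge readout
acc = float(np.mean(np.sign(X @ w) == y))
print("training accuracy:", acc)
```

Only the readout vector `w` is trained; the reservoir processes each raw sequence directly, mirroring the "no feature extraction, no preprocessing" claim of the abstract.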