Deep Complex Networks
At present, the vast majority of building blocks, techniques, and
architectures for deep learning are based on real-valued operations and
representations. However, recent work on recurrent neural networks and older
fundamental theoretical analysis suggests that complex numbers could have a
richer representational capacity and could also facilitate noise-robust memory
retrieval mechanisms. Despite their attractive properties and potential for
opening up entirely new neural architectures, complex-valued deep neural
networks have been marginalized due to the absence of the building blocks
required to design such models. In this work, we provide the key atomic
components for complex-valued deep neural networks and apply them to
convolutional feed-forward networks and convolutional LSTMs. More precisely, we
rely on complex convolutions and present algorithms for complex
batch-normalization and complex weight initialization strategies for
complex-valued neural nets, and we use them in experiments with end-to-end
training schemes. We demonstrate that such complex-valued models are
competitive with their real-valued counterparts. We test deep complex models on
several computer vision tasks, on music transcription using the MusicNet
dataset and on Speech Spectrum Prediction using the TIMIT dataset. We achieve
state-of-the-art performance on these audio-related tasks.
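As an illustration of the complex convolution this abstract relies on, here is a minimal sketch that builds a complex convolution out of four real-valued ones, using the identity (a + ib) * (x + iy) = (a*x - b*y) + i(b*x + a*y). The function name `complex_conv1d` and the 1-D NumPy setting are assumptions made for this example and are not taken from the paper. Note also that `np.convolve` flips the kernel (true convolution), whereas deep learning frameworks typically compute cross-correlation.

```python
import numpy as np

def complex_conv1d(signal, kernel):
    """Convolve a complex signal with a complex kernel using only real convolutions.

    Hypothetical helper for illustration; not the paper's implementation.
    """
    x, y = signal.real, signal.imag   # real and imaginary parts of the input
    a, b = kernel.real, kernel.imag   # real and imaginary parts of the filter
    real = np.convolve(x, a, mode="valid") - np.convolve(y, b, mode="valid")
    imag = np.convolve(x, b, mode="valid") + np.convolve(y, a, mode="valid")
    return real + 1j * imag

rng = np.random.default_rng(0)
signal = rng.normal(size=16) + 1j * rng.normal(size=16)
kernel = rng.normal(size=3) + 1j * rng.normal(size=3)
out = complex_conv1d(signal, kernel)
```

Splitting the operation this way is what allows a complex-valued layer to be implemented on top of real-valued convolution primitives.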
Feedback control by online learning an inverse model
A model, predictor, or error estimator is often used by a feedback controller to control a plant. Creating such a model is difficult when the plant exhibits nonlinear behavior. In this paper, a novel online learning control framework is proposed that does not require explicit knowledge about the plant. This framework uses two learning modules, one for creating an inverse model, and the other for actually controlling the plant. Except for their inputs, they are identical. The inverse model learns from the exploration performed by the not yet fully trained controller, while the actual controller is based on the currently learned model. The proposed framework allows fast online learning of an accurate controller. The controller can be applied to a broad range of tasks with different dynamic characteristics. We validate this claim by applying our control framework to several control tasks: 1) the heating tank problem (slow nonlinear dynamics); 2) flight pitch control (slow linear dynamics); and 3) the balancing problem of a double inverted pendulum (fast linear and nonlinear dynamics). The results of these experiments show that fast learning and accurate control can be achieved. Furthermore, a comparison is made with some classical control approaches, and observations concerning convergence and stability are made.
The Power of Linear Recurrent Neural Networks
Recurrent neural networks are a powerful means to cope with time series. We
show how a type of linearly activated recurrent neural networks, which we call
predictive neural networks, can approximate any time-dependent function f(t)
given by a number of function values. The approximation can effectively be
learned by simply solving a linear equation system; no backpropagation or
similar methods are needed. Furthermore, the network size can be reduced by
keeping only the most relevant components. Thus, in contrast to others, our approach
not only learns network weights but also the network architecture. The networks
have interesting properties: They end up in ellipse trajectories in the long
run and allow the prediction of further values and compact representations of
functions. We demonstrate this by several experiments, among them multiple
superimposed oscillators (MSO), robotic soccer, and predicting stock prices.
Predictive neural networks outperform the previous state-of-the-art for the MSO
task with a minimal number of units.
Comment: 22 pages, 14 figures and tables, revised implementatio
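To make the idea of learning a linear predictor "by simply solving a linear equation system" concrete, here is a minimal sketch. It deliberately swaps in a simpler technique than the paper's predictive neural networks: a plain linear autoregressive model fit by least squares on a two-oscillator signal of the MSO type. The function names, the model order of 4, and the frequencies 0.2 and 0.311 are assumptions chosen for this example.

```python
import numpy as np

def fit_linear_predictor(series, order):
    """Fit weights so that the next value is a linear function of the last `order` values."""
    # Each row is a sliding window of length `order`; each target is the value that follows it.
    windows = np.lib.stride_tricks.sliding_window_view(series[:-1], order)
    targets = series[order:]
    weights, *_ = np.linalg.lstsq(windows, targets, rcond=None)
    return weights

def predict(series, weights, steps):
    """Roll the linear recurrence forward, feeding each prediction back in as input."""
    history = list(series[-len(weights):])
    forecast = []
    for _ in range(steps):
        nxt = float(np.dot(weights, history[-len(weights):]))
        history.append(nxt)
        forecast.append(nxt)
    return np.array(forecast)

# Two superimposed sine waves (illustrative frequencies).
t = np.arange(300)
signal = np.sin(0.2 * t) + np.sin(0.311 * t)

weights = fit_linear_predictor(signal[:250], order=4)   # no backpropagation involved
forecast = predict(signal[:250], weights, steps=50)
max_error = np.max(np.abs(forecast - signal[250:]))
```

A sum of two sinusoids satisfies an exact linear recurrence of order 4, which is why such a small model can continue this particular signal; noisy or more complex series would need a different order and would not be reproduced exactly.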
Hierarchical Temporal Representation in Linear Reservoir Computing
Recently, studies on deep Reservoir Computing (RC) highlighted the role of
layering in deep recurrent neural networks (RNNs). In this paper, the use of
linear recurrent units allows us to provide further evidence of the intrinsic
hierarchical temporal representation in deep RNNs through frequency analysis
applied to the state signals. The potential of our approach is assessed on
the class of Multiple Superimposed Oscillator tasks. Furthermore, our
investigation provides useful insights to open a discussion on the main aspects
that characterize the deep learning framework in the temporal domain.
Comment: This is a pre-print of the paper submitted to the 27th Italian
Workshop on Neural Networks, WIRN 201
Cluster-based Input Weight Initialization for Echo State Networks
Echo State Networks (ESNs) are a special type of recurrent neural networks
(RNNs), in which the input and recurrent connections are traditionally
generated randomly, and only the output weights are trained. Despite the recent
success of ESNs in various tasks of audio, image and radar recognition, we
postulate that a purely random initialization is not the ideal way of
initializing ESNs. The aim of this work is to propose an unsupervised
initialization of the input connections using the K-Means algorithm on the
training data. We show that this initialization performs as well as or
better than a randomly initialized ESN while needing significantly fewer
reservoir neurons (2000 vs. 4000 for spoken digit recognition, and 300 vs. 8000
neurons for f0 extraction) and thus reducing the amount of training time.
Furthermore, we discuss how this approach provides the opportunity to estimate
a suitable reservoir size based on prior knowledge about the data.
Comment: Submitted to IEEE Transactions on Neural Networks and Learning Systems
(TNNLS), 202
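The cluster-based initialization described in this abstract can be sketched roughly as follows: run K-Means on the training inputs and use each centroid as the input weight vector of one reservoir neuron. This is only an illustration under stated assumptions. The small `kmeans` routine, the `run_reservoir` helper, the tanh state update, the spectral radius of 0.9, and the random toy data are all choices made for this example; the abstract does not specify the paper's normalization or reservoir settings.

```python
import numpy as np

def kmeans(data, k, iters=20, seed=0):
    """Plain Lloyd's algorithm; returns k centroids (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    centroids = data[rng.choice(len(data), size=k, replace=False)].copy()
    for _ in range(iters):
        # Assign every sample to its nearest centroid.
        dists = np.linalg.norm(data[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Move each centroid to the mean of its members (skip empty clusters).
        for j in range(k):
            members = data[labels == j]
            if len(members):
                centroids[j] = members.mean(axis=0)
    return centroids

def run_reservoir(inputs, w_in, w_rec):
    """Collect reservoir states for an input sequence using a tanh update."""
    state = np.zeros(w_rec.shape[0])
    states = []
    for u in inputs:
        state = np.tanh(w_in @ u + w_rec @ state)
        states.append(state)
    return np.array(states)

rng = np.random.default_rng(1)
inputs = rng.normal(size=(200, 3))        # toy training data: 200 steps, 3 features
n_neurons = 10

w_in = kmeans(inputs, k=n_neurons)        # one centroid per reservoir neuron
w_rec = rng.normal(size=(n_neurons, n_neurons))
spectral_radius = np.max(np.abs(np.linalg.eigvals(w_rec)))
w_rec *= 0.9 / spectral_radius            # rescale recurrent weights (assumed setting)

states = run_reservoir(inputs, w_in, w_rec)
```

In a full ESN, only a linear readout on `states` would then be trained, for example by ridge regression; the input and recurrent weights stay fixed.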
Echo State Networks: analysis, training and predictive control
The goal of this paper is to investigate the theoretical properties, the
training algorithm, and the predictive control applications of Echo State
Networks (ESNs), a particular kind of Recurrent Neural Networks. First, a
condition guaranteeing incremental global asymptotic stability is devised. Then,
a modified training algorithm allowing for dimensionality reduction of ESNs is
presented. Finally, a model predictive controller is designed to solve the
tracking problem, relying on ESNs as the model of the system. Numerical results
concerning the predictive control of a nonlinear process for pH neutralization
confirm the effectiveness of the proposed algorithms for the identification,
dimensionality reduction, and the control design for ESNs.
Comment: 6 pages, 5 figures, submitted to European Control Conference (ECC