Lifelong Neural Predictive Coding: Learning Cumulatively Online without Forgetting
In lifelong learning systems, especially those based on artificial neural
networks, one of the biggest obstacles is the severe inability to retain old
knowledge as new information is encountered. This phenomenon is known as
catastrophic forgetting. In this article, we propose a new kind of
connectionist architecture, the Sequential Neural Coding Network, that is
robust to forgetting when learning from streams of data points and, unlike
networks of today, does not learn via the immensely popular back-propagation of
errors. Grounded in the neurocognitive theory of predictive processing, our
model adapts its synapses in a biologically-plausible fashion, while another,
complementary neural system rapidly learns to direct and control this
cortex-like structure by mimicking the task-executive control functionality of
the basal ganglia. In our experiments, we demonstrate that our self-organizing
system experiences significantly less forgetting as compared to standard neural
models and outperforms a wide swath of previously proposed methods even though
it is trained across task datasets in a stream-like fashion. The promising
performance of our complementary system on benchmarks, e.g., SplitMNIST, Split
Fashion MNIST, and Split NotMNIST, offers evidence that by incorporating
mechanisms prominent in real neuronal systems, such as competition, sparse
activation patterns, and iterative input processing, a new possibility for
tackling the grand challenge of lifelong machine learning opens up.Comment: Key updates including results on standard benchmarks, e.g., split
mnist/fmnist/not-mnist. Task selection/basal ganglia model has been
integrate
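As a rough illustration of the predictive processing principle the article builds on (not the Sequential Neural Coding Network itself), the sketch below settles a latent state by iteratively reducing prediction error, then applies a local, Hebbian-like weight update in place of back-propagation. All sizes and learning rates are arbitrary assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

def infer_latent(x, W, n_steps=50, lr=0.1):
    """Settle the latent state z by iteratively reducing the
    prediction error e = x - W z (gradient descent on ||e||^2)."""
    z = np.zeros(W.shape[1])
    for _ in range(n_steps):
        e = x - W @ z          # prediction error at the input layer
        z += lr * (W.T @ e)    # move z to better explain the input
    return z

def local_update(x, z, W, lr=0.01):
    """Local, Hebbian-like weight update from the settled error
    and latent state; no error signal propagates through layers."""
    e = x - W @ z
    return W + lr * np.outer(e, z)

# toy usage: one input vector, one inference/update cycle
W = 0.1 * rng.normal(size=(8, 4))
x = rng.normal(size=8)
z = infer_latent(x, W)
W_new = local_update(x, z, W)
err_before = np.linalg.norm(x - W @ z)
err_after = np.linalg.norm(x - W_new @ z)
```

The key contrast with back-propagation is that both updates use only quantities local to the layer: the error and the activity on either side of the weights.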
Assessing the conservation value of waterbodies: the example of the Loire floodplain (France)
In recent decades, two of the main management tools used to stem biodiversity erosion have been biodiversity monitoring and the conservation of natural areas. However, socio-economic pressure means that it is not usually possible to preserve the entire landscape, and so the rational prioritisation of sites has become a crucial issue. In this context, and because floodplains are among the most threatened ecosystems, we propose a statistical strategy for evaluating conservation value and use it to prioritise 46 waterbodies in the Loire floodplain (France). We began by determining a synthetic conservation index of fish communities (Q) for each waterbody. This synthetic index combines a conservation status index, an origin index, a rarity index and a richness index. We divided the waterbodies into 6 clusters with distinct structures of the basic indices. One of these clusters, with a high median Q value, indicated that 4 waterbodies are important for fish biodiversity conservation. Conversely, two clusters with low median Q values included 11 waterbodies where restoration is called for. The results picked out high connectivity levels and low abundance of aquatic vegetation as the two main environmental characteristics of waterbodies with high conservation value. In addition, assessing the biodiversity and conservation value of territories using our multi-index approach combined with an a posteriori hierarchical classification methodology offers two major advantages: (i) it can be extended geographically and (ii) it can be adapted to multiple taxa.
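The abstract names the four sub-indices behind Q but not how they are aggregated, so the sketch below is purely a hypothetical stand-in: each sub-index is min-max normalised across waterbodies and Q is taken as their equal-weight mean, on toy data for five waterbodies:

```python
import numpy as np

def minmax(v):
    """Scale an index to [0, 1] across waterbodies."""
    v = np.asarray(v, dtype=float)
    return (v - v.min()) / (v.max() - v.min())

def synthetic_q(status, origin, rarity, richness):
    """Hypothetical aggregation: equal-weight mean of the four
    normalised sub-indices (the paper's actual weighting may differ)."""
    parts = [minmax(status), minmax(origin), minmax(rarity), minmax(richness)]
    return np.mean(parts, axis=0)

# toy data for 5 waterbodies (invented values)
status   = [0.2, 0.8, 0.5, 0.9, 0.1]
origin   = [0.6, 0.7, 0.3, 0.8, 0.2]
rarity   = [0.1, 0.9, 0.4, 0.7, 0.3]
richness = [5, 12, 8, 14, 4]

q = synthetic_q(status, origin, rarity, richness)
ranking = np.argsort(q)[::-1]  # waterbodies ordered by conservation value
```

A ranking like this is what a subsequent clustering step (six clusters in the paper) would then group into high- and low-priority sets.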
From neural PCA to deep unsupervised learning
A network supporting deep unsupervised learning is presented. The network is
an autoencoder with lateral shortcut connections from the encoder to decoder at
each level of the hierarchy. The lateral shortcut connections allow the higher
levels of the hierarchy to focus on abstract invariant features. While standard
autoencoders are analogous to latent variable models with a single layer of
stochastic variables, the proposed network is analogous to hierarchical latent
variables models. Learning combines denoising autoencoder and denoising sources
separation frameworks. Each layer of the network contributes to the cost
function a term which measures the distance of the representations produced by
the encoder and the decoder. Since training signals originate from all levels
of the network, all layers can learn efficiently even in deep networks. The
speedup offered by cost terms from higher levels of the hierarchy and the
ability to learn invariant features are demonstrated in experiments.
Comment: A revised version of an article that has been accepted for
publication in Advances in Independent Component Analysis and Learning
Machines (2015), edited by Ella Bingham, Samuel Kaski, Jorma Laaksonen and
Jouko Lampinen.
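The architecture above can be sketched minimally in numpy: a corrupted encoder pass, a decoder whose levels combine the top-down signal with the lateral (noisy) encoder activation, and a cost that sums per-level reconstruction distances. The simple averaging combinator and all layer sizes are illustrative assumptions, not the paper's learned combinator:

```python
import numpy as np

rng = np.random.default_rng(0)

# random weights for a 2-level encoder/decoder (illustrative sizes)
W1 = 0.3 * rng.normal(size=(6, 10))   # encoder level 1
W2 = 0.3 * rng.normal(size=(3, 6))    # encoder level 2
V2 = 0.3 * rng.normal(size=(6, 3))    # decoder level 2 -> 1
V1 = 0.3 * rng.normal(size=(10, 6))   # decoder level 1 -> input

def relu(a):
    return np.maximum(a, 0.0)

def forward(x, noise=0.0):
    """Encoder pass, optionally corrupted with Gaussian noise per level."""
    h1 = relu(W1 @ x) + noise * rng.normal(size=6)
    h2 = relu(W2 @ h1) + noise * rng.normal(size=3)
    return h1, h2

x = rng.normal(size=10)
h1_clean, h2_clean = forward(x)             # clean targets
h1_noisy, h2_noisy = forward(x, noise=0.1)  # corrupted pass

# decoder: each level mixes the top-down signal with the lateral
# shortcut from the noisy encoder (a fixed average stands in for
# the learned combinator)
d2 = h2_noisy
d1 = 0.5 * relu(V2 @ d2) + 0.5 * h1_noisy
d0 = relu(V1 @ d1)

# cost: every level contributes a distance term, so training
# signals originate from all levels of the network
cost = (np.sum((h2_clean - d2) ** 2)
        + np.sum((h1_clean - d1) ** 2)
        + np.sum((x - d0) ** 2))
```

Because the shortcut carries the detail needed for reconstruction, the top of the hierarchy is free to represent abstract invariant features, which is the motivation stated in the abstract.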
Semi-Supervised Speech Emotion Recognition with Ladder Networks
Speech emotion recognition (SER) systems find applications in various fields
such as healthcare, education, and security and defense. A major drawback of
these systems is their lack of generalization across different conditions. This
problem can be solved by training models on large amounts of labeled data from
the target domain, which is expensive and time-consuming. Another approach is
to increase the generalization of the models. An effective way to achieve this
goal is by regularizing the models through multitask learning (MTL), where
auxiliary tasks are learned along with the primary task. These methods often
require labeled data for the auxiliary tasks (gender, speaker identity, age,
or other emotional descriptors), which is expensive to collect for emotion
recognition. This study proposes the use of ladder networks for emotion
recognition, which utilize an unsupervised auxiliary task. The primary task is
a regression problem to predict emotional attributes. The auxiliary task is the
reconstruction of intermediate feature representations using a denoising
autoencoder. This auxiliary task does not require labels so it is possible to
train the framework in a semi-supervised fashion with abundant unlabeled data
from the target domain. This study shows that the proposed approach creates a
powerful framework for SER, achieving performance superior to fully
supervised single-task learning (STL) and MTL baselines. The approach is
implemented with several acoustic features, showing that ladder networks
generalize significantly better in cross-corpus settings. Compared to the STL
baselines, the proposed approach achieves relative gains in concordance
correlation coefficient (CCC) between 3.0% and 3.5% for within corpus
evaluations, and between 16.1% and 74.1% for cross corpus evaluations,
highlighting the power of the architecture.
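The evaluation metric used above, the concordance correlation coefficient, has a simple closed form and can be computed directly. This is a standard implementation of Lin's CCC, not code from the study:

```python
import numpy as np

def ccc(y_true, y_pred):
    """Concordance correlation coefficient:
    2*cov(x, y) / (var(x) + var(y) + (mean(x) - mean(y))**2).
    Equals 1 only for perfect agreement; unlike Pearson's r, it
    penalises shifts and scale differences between the series."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    cov = np.mean((y_true - y_true.mean()) * (y_pred - y_pred.mean()))
    return (2.0 * cov) / (y_true.var() + y_pred.var()
                          + (y_true.mean() - y_pred.mean()) ** 2)
```

For example, predictions that are perfectly correlated but offset from the labels still score below 1, which is why CCC is the preferred metric for continuous emotional attributes.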
An Analysis of the Connections Between Layers of Deep Neural Networks
We present an analysis of different techniques for selecting the connection
between layers of deep neural networks. Traditional deep neural networks use
random connection tables between layers to keep the number of connections
small and tune to different image features. This kind of connection performs
adequately in supervised deep networks because their values are refined during
the training. On the other hand, in unsupervised learning, one cannot rely on
back-propagation techniques to learn the connections between layers. In this
work, we tested four different techniques for connecting the first layer of the
network to the second layer on the CIFAR and SVHN datasets and showed that the
accuracy can be improved by up to 3% depending on the technique used. We also
showed that learning the connections based on the co-occurrences of the
features does not confer an advantage over a random connection table in small
networks. This work is helpful to improve the efficiency of connections between
the layers of unsupervised deep neural networks.
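A random connection table of the kind used as the baseline above can be built in a few lines; sizes are arbitrary, and each second-layer map is wired to a random subset of first-layer maps:

```python
import numpy as np

rng = np.random.default_rng(0)

def random_connection_table(n_in, n_out, fan_in):
    """For each second-layer feature map, pick a random subset of
    first-layer maps it connects to (one row per output map)."""
    return np.array([rng.choice(n_in, size=fan_in, replace=False)
                     for _ in range(n_out)])

# 64 output maps, each reading from 4 of 32 input maps
table = random_connection_table(n_in=32, n_out=64, fan_in=4)
# table[j] lists the first-layer maps feeding output map j
```

The learned alternatives analysed in the paper would replace `rng.choice` with a selection rule (e.g. based on feature co-occurrence) while keeping the same table shape.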
Aerodynamic Parameters Estimation Using Radial Basis Function Neural Partial Differentiation Method
Aerodynamic parameter estimation involves modelling force and moment coefficients and computing stability and control derivatives from recorded flight data. This problem has been extensively studied in the past using classical approaches such as the output error, filter error and equation error methods. An alternative to these model-based methods is machine learning, such as artificial neural networks. In this paper, a radial basis function neural network (RBF NN) is used to model the lateral-directional force and moment coefficients. The RBF NN is trained using the k-means clustering algorithm to find the centers of the radial basis functions and an extended Kalman filter to obtain the weights in the output layer. Then, a new method is proposed to obtain the stability and control derivatives: the first-order partial differentiation is performed analytically on the output approximated by the radial basis function neural network. The stability and control derivatives are computed at each training data point, thus reducing the post-training time and computational effort compared to the hitherto-used delta method and its variants. The efficacy of the identified model and the proposed neural derivative method is demonstrated using real flight data of the ATTAS aircraft. The results from the proposed approach compare well with those from the other methods.
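The neural derivative idea, differentiating the trained network analytically instead of perturbing it, can be sketched for a Gaussian RBF network with a shared width (a simplified stand-in for the paper's trained model): for y(x) = sum_i w_i exp(-||x - c_i||^2 / (2 sigma^2)), the gradient is dy/dx = sum_i w_i phi_i(x) (c_i - x) / sigma^2. All centers, weights, and widths below are random placeholders, not identified aircraft parameters:

```python
import numpy as np

rng = np.random.default_rng(0)

def rbf_output(x, centers, weights, sigma):
    """y(x) = sum_i w_i * exp(-||x - c_i||^2 / (2*sigma^2))"""
    phi = np.exp(-np.sum((centers - x) ** 2, axis=1) / (2.0 * sigma ** 2))
    return phi @ weights

def rbf_gradient(x, centers, weights, sigma):
    """Analytic dy/dx: sum_i w_i * phi_i * (c_i - x) / sigma^2.
    Evaluating this at each training point yields the derivatives
    directly, with no retraining or perturbation of the network."""
    phi = np.exp(-np.sum((centers - x) ** 2, axis=1) / (2.0 * sigma ** 2))
    return ((weights * phi)[:, None] * (centers - x)).sum(axis=0) / sigma ** 2

# toy model: 5 centers in a 3-D input space
centers = rng.normal(size=(5, 3))
weights = rng.normal(size=5)
sigma = 1.0
x = rng.normal(size=3)
grad = rbf_gradient(x, centers, weights, sigma)
```

In the paper's setting x would hold flight variables (e.g. sideslip angle, control deflections) and the components of the gradient would be the stability and control derivatives at that data point.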