
    Lifelong Neural Predictive Coding: Learning Cumulatively Online without Forgetting

    In lifelong learning systems, especially those based on artificial neural networks, one of the biggest obstacles is the severe inability to retain old knowledge as new information is encountered. This phenomenon is known as catastrophic forgetting. In this article, we propose a new kind of connectionist architecture, the Sequential Neural Coding Network, that is robust to forgetting when learning from streams of data points and, unlike networks of today, does not learn via the immensely popular back-propagation of errors. Grounded in the neurocognitive theory of predictive processing, our model adapts its synapses in a biologically plausible fashion, while another, complementary neural system rapidly learns to direct and control this cortex-like structure by mimicking the task-executive control functionality of the basal ganglia. In our experiments, we demonstrate that our self-organizing system experiences significantly less forgetting than standard neural models and outperforms a wide swath of previously proposed methods, even though it is trained across task datasets in a stream-like fashion. The promising performance of our complementary system on benchmarks, e.g., Split MNIST, Split Fashion MNIST, and Split NotMNIST, offers evidence that by incorporating mechanisms prominent in real neuronal systems, such as competition, sparse activation patterns, and iterative input processing, a new possibility for tackling the grand challenge of lifelong machine learning opens up.
    Comment: Key updates, including results on standard benchmarks (Split MNIST, Split Fashion MNIST, Split NotMNIST); the task-selection/basal ganglia model has been integrated.
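    As a rough illustration of the backprop-free, locally driven learning this abstract describes, the sketch below settles a single predictive-coding layer by iteratively reducing its input prediction error and then updates the synapses with a purely local rule. It is not the authors' Sequential Neural Coding Network or its basal-ganglia-like task controller; the layer sizes, learning rates, and iteration counts are illustrative assumptions.

```python
import numpy as np

# Minimal sketch of one predictive-coding layer that learns without
# back-propagation. Illustrative only: sizes, rates, and step counts
# are assumptions, not the authors' settings.

rng = np.random.default_rng(0)
n_input, n_latent = 784, 128
W = 0.05 * rng.standard_normal((n_latent, n_input))  # generative weights

def settle_and_learn(x, W, n_steps=20, lr_z=0.1, lr_w=0.01):
    """Iteratively infer latent state z to reduce prediction error,
    then update W with a purely local Hebbian-style rule."""
    z = np.zeros(n_latent)
    for _ in range(n_steps):              # iterative input processing
        pred = W.T @ z                    # top-down prediction of x
        err = x - pred                    # prediction error at the input layer
        z += lr_z * (W @ err)             # local correction of the state
        z = np.maximum(z, 0.0)            # sparse, non-negative activity
    # Synaptic update uses only locally available error and activity.
    W += lr_w * np.outer(z, x - W.T @ z)
    return z, W

x = rng.random(n_input)                   # stand-in for one data point
z, W = settle_and_learn(x, W)
print("latent sparsity:", np.mean(z > 0))
```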

    Assessing the conservation value of waterbodies: the example of the Loire floodplain (France)

    In recent decades, two of the main management tools used to stem biodiversity erosion have been biodiversity monitoring and the conservation of natural areas. However, socio-economic pressure means that it is not usually possible to preserve the entire landscape, and so the rational prioritisation of sites has become a crucial issue. In this context, and because floodplains are among the most threatened ecosystems, we propose a statistical strategy for evaluating conservation value and use it to prioritise 46 waterbodies in the Loire floodplain (France). We began by determining a synthetic conservation index of fish communities (Q) for each waterbody. This synthetic index combines a conservation status index, an origin index, a rarity index and a richness index. We divided the waterbodies into 6 clusters with distinct structures of the basic indices. One of these clusters, with a high median Q value, identified 4 waterbodies that are important for fish biodiversity conservation. Conversely, two clusters with low median Q values included 11 waterbodies where restoration is called for. The results picked out high connectivity levels and low abundance of aquatic vegetation as the two main environmental characteristics of waterbodies with high conservation value. In addition, assessing the biodiversity and conservation value of territories with our multi-index approach combined with an a posteriori hierarchical classification offers two major advantages: (i) it can be extended geographically and (ii) it can be adapted to multiple taxa.
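    The abstract does not give the formula for the synthetic index Q, so the sketch below simply averages the four sub-indices and then applies an a posteriori hierarchical classification cut into 6 clusters; the random stand-in data for the 46 waterbodies, the unweighted mean, and the Ward linkage are all assumptions, not the study's method.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

# Illustrative sketch: Q is assumed to be an unweighted mean of the
# four sub-indices, and the data are random stand-ins.

rng = np.random.default_rng(1)
n_waterbodies = 46
# Columns: conservation status, origin, rarity, richness (scaled 0-1)
indices = rng.random((n_waterbodies, 4))

Q = indices.mean(axis=1)                  # assumed synthetic index

# A posteriori hierarchical classification on the basic indices,
# cut into 6 clusters as in the study.
Z = linkage(indices, method="ward")
clusters = fcluster(Z, t=6, criterion="maxclust")

for c in np.unique(clusters):
    print(f"cluster {c}: n={np.sum(clusters == c)}, "
          f"median Q={np.median(Q[clusters == c]):.2f}")
```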

    From neural PCA to deep unsupervised learning

    A network supporting deep unsupervised learning is presented. The network is an autoencoder with lateral shortcut connections from the encoder to the decoder at each level of the hierarchy. The lateral shortcut connections allow the higher levels of the hierarchy to focus on abstract invariant features. While standard autoencoders are analogous to latent variable models with a single layer of stochastic variables, the proposed network is analogous to hierarchical latent variable models. Learning combines the denoising autoencoder and denoising source separation frameworks. Each layer of the network contributes a term to the cost function that measures the distance between the representations produced by the encoder and the decoder. Since training signals originate from all levels of the network, all layers can learn efficiently even in deep networks. The speedup offered by cost terms from higher levels of the hierarchy and the ability to learn invariant features are demonstrated in experiments.
    Comment: A revised version of an article accepted for publication in Advances in Independent Component Analysis and Learning Machines (2015), edited by Ella Bingham, Samuel Kaski, Jorma Laaksonen and Jouko Lampinen.
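    A minimal sketch of the cost structure described here: an encoder/decoder pair with a lateral shortcut at each level, where every layer contributes a denoising-reconstruction term to the total cost. The layer sizes, the simple averaging combinator, and the noise level are assumptions, and no training loop is shown.

```python
import numpy as np

# Sketch of a ladder-style autoencoder cost: lateral shortcuts feed the
# corrupted encoder activations into the decoder, and each level adds a
# denoising term comparing the decoder output to the clean encoding.

rng = np.random.default_rng(0)
sizes = [20, 10, 5]                     # input, hidden, top (assumed)
W_enc = [0.1 * rng.standard_normal((sizes[i + 1], sizes[i])) for i in range(2)]
W_dec = [0.1 * rng.standard_normal((sizes[i], sizes[i + 1])) for i in range(2)]

def relu(a):
    return np.maximum(a, 0.0)

x = rng.random(sizes[0])

# Clean encoder pass: targets for the per-layer cost terms.
clean = [x]
for W in W_enc:
    clean.append(relu(W @ clean[-1]))

# Corrupted encoder pass (denoising setup).
noisy = [x + 0.1 * rng.standard_normal(sizes[0])]
for W in W_enc:
    noisy.append(relu(W @ noisy[-1]) + 0.1 * rng.standard_normal(W.shape[0]))

# Decoder pass with lateral shortcuts: each reconstruction combines the
# top-down signal with the corrupted activation at the same level.
recon = [None] * 3
recon[2] = noisy[2]
for level in (1, 0):
    top_down = W_dec[level] @ recon[level + 1]
    recon[level] = 0.5 * (top_down + noisy[level])   # assumed combinator

# Every layer contributes a term comparing its reconstruction to the
# clean encoder representation at that level.
cost = sum(np.mean((recon[l] - clean[l]) ** 2) for l in range(3))
print("layer-wise denoising cost:", cost)
```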

    Semi-Supervised Speech Emotion Recognition with Ladder Networks

    Speech emotion recognition (SER) systems find applications in various fields such as healthcare, education, and security and defense. A major drawback of these systems is their lack of generalization across different conditions. This problem can be solved by training models on large amounts of labeled data from the target domain, which is expensive and time-consuming. Another approach is to increase the generalization of the models. An effective way to achieve this goal is by regularizing the models through multitask learning (MTL), where auxiliary tasks are learned along with the primary task. These methods often require labeled data for the auxiliary tasks (gender, speaker identity, age, or other emotional descriptors), which is costly to collect for emotion recognition. This study proposes the use of ladder networks for emotion recognition, which utilize an unsupervised auxiliary task. The primary task is a regression problem to predict emotional attributes. The auxiliary task is the reconstruction of intermediate feature representations using a denoising autoencoder. This auxiliary task does not require labels, so it is possible to train the framework in a semi-supervised fashion with abundant unlabeled data from the target domain. This study shows that the proposed approach creates a powerful framework for SER, achieving better performance than fully supervised single-task learning (STL) and MTL baselines. The approach is implemented with several acoustic features, showing that ladder networks generalize significantly better in cross-corpus settings. Compared to the STL baselines, the proposed approach achieves relative gains in concordance correlation coefficient (CCC) between 3.0% and 3.5% for within-corpus evaluations, and between 16.1% and 74.1% for cross-corpus evaluations, highlighting the power of the architecture.
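    The sketch below illustrates the two ingredients this abstract combines, using stand-in linear models: a concordance correlation coefficient (CCC) objective for the supervised attribute regression, and a label-free denoising-reconstruction term computed on unlabeled data. The feature dimensions, noise level, and loss weighting are assumptions, not values from the paper.

```python
import numpy as np

# Sketch of a semi-supervised SER objective: a CCC-based regression loss
# on a labeled batch plus a denoising-reconstruction loss on unlabeled
# data. The "models" are placeholders, not the paper's ladder network.

def ccc(y_true, y_pred):
    """Concordance correlation coefficient between two 1-D arrays."""
    mu_t, mu_p = y_true.mean(), y_pred.mean()
    var_t, var_p = y_true.var(), y_pred.var()
    cov = np.mean((y_true - mu_t) * (y_pred - mu_p))
    return 2 * cov / (var_t + var_p + (mu_t - mu_p) ** 2)

rng = np.random.default_rng(0)
labeled_feats = rng.random((32, 40))        # acoustic features, labeled batch
arousal = rng.random(32)                    # emotional-attribute labels
unlabeled_feats = rng.random((128, 40))     # abundant unlabeled batch

# Stand-in "model": a linear regressor and a linear autoencoder.
w_reg = 0.1 * rng.standard_normal(40)
W_ae = 0.1 * rng.standard_normal((40, 40))

pred = labeled_feats @ w_reg
supervised_loss = 1.0 - ccc(arousal, pred)  # maximise CCC on the primary task

noisy = unlabeled_feats + 0.1 * rng.standard_normal(unlabeled_feats.shape)
recon = noisy @ W_ae                        # auxiliary denoising reconstruction
unsupervised_loss = np.mean((recon - unlabeled_feats) ** 2)

total_loss = supervised_loss + 0.5 * unsupervised_loss  # assumed weighting
print(f"total semi-supervised loss: {total_loss:.3f}")
```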

    An Analysis of the Connections Between Layers of Deep Neural Networks

    We present an analysis of different techniques for selecting the connections between layers of deep neural networks. Traditional deep neural networks use random connection tables between layers to keep the number of connections small and to tune to different image features. This kind of connection performs adequately in supervised deep networks because the connection values are refined during training. In unsupervised learning, on the other hand, one cannot rely on back-propagation to learn the connections between layers. In this work, we tested four different techniques for connecting the first layer of the network to the second layer on the CIFAR and SVHN datasets and showed that accuracy can be improved by up to 3% depending on the technique used. We also showed that learning the connections based on the co-occurrences of the features does not confer an advantage over a random connection table in small networks. This work is helpful for improving the efficiency of the connections between the layers of unsupervised deep neural networks.
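    As a rough sketch of what a connection table is, the code below builds two of the kinds of tables discussed here: a random table, and one that groups first-layer feature maps by the co-occurrence (correlation) of their activations. The map counts, fan-in, and stand-in activations are illustrative assumptions.

```python
import numpy as np

# Sketch of two connection-table strategies between layer 1 and layer 2:
# random subsets versus subsets chosen by activation co-occurrence.

rng = np.random.default_rng(0)
n_layer1, n_layer2, fan_in = 64, 128, 8     # assumed feature-map counts

# Random connection table: each second-layer unit reads a random subset
# of first-layer feature maps.
random_table = np.stack(
    [rng.choice(n_layer1, size=fan_in, replace=False) for _ in range(n_layer2)]
)

# Co-occurrence-based table: connect each unit to the first-layer maps
# whose activations correlate most strongly with a randomly chosen seed map.
activations = rng.random((1000, n_layer1))          # stand-in layer-1 outputs
cooc = np.corrcoef(activations.T)                   # feature co-occurrence
seeds = rng.choice(n_layer1, size=n_layer2)
cooc_table = np.stack([np.argsort(-cooc[s])[:fan_in] for s in seeds])

print(random_table.shape, cooc_table.shape)         # (128, 8) each
```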

    Aerodynamic Parameters Estimation Using Radial Basis Function Neural Partial Differentiation Method

    Aerodynamic parameter estimation involves modelling force and moment coefficients and computing stability and control derivatives from recorded flight data. This problem has been studied extensively in the past using classical approaches such as the output error, filter error and equation error methods. An alternative to these model-based methods is machine learning, such as artificial neural networks. In this paper, a radial basis function neural network (RBF NN) is used to model the lateral-directional force and moment coefficients. The RBF NN is trained using the k-means clustering algorithm to find the centres of the radial basis functions and an extended Kalman filter to obtain the weights of the output layer. A new method is then proposed to obtain the stability and control derivatives: first-order partial differentiation is performed analytically on the output approximated by the radial basis function neural network. The stability and control derivatives are computed at each training data point, thus reducing the post-training time and computational effort compared to the hitherto-used delta method and its variants. The efficacy of the identified model and the proposed neural derivative method is demonstrated using real-time flight data of the ATTAS aircraft. The results from the proposed approach compare well with those from the other methods.
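    The sketch below illustrates the analytic-derivative idea: fit an RBF network whose centres come from k-means, then differentiate its output with respect to the inputs in closed form at every training point. For brevity the output weights are obtained by least squares rather than the extended Kalman filter used in the paper, and the data, basis widths, and centre count are stand-in assumptions.

```python
import numpy as np
from sklearn.cluster import KMeans

# Sketch: fit an RBF network (k-means centres, least-squares weights as a
# stand-in for the EKF) and evaluate its closed-form input derivatives
# at each training point. Data are synthetic placeholders.

rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(200, 3))                    # stand-in flight variables
y = 0.8 * X[:, 0] - 0.3 * X[:, 1] + 0.1 * X[:, 2] ** 2   # stand-in coefficient

n_centres, sigma = 10, 0.5
centres = KMeans(n_clusters=n_centres, n_init=10,
                 random_state=0).fit(X).cluster_centers_

def phi(X):
    """Gaussian basis activations for each row of X."""
    d2 = ((X[:, None, :] - centres[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * sigma ** 2))

w, *_ = np.linalg.lstsq(phi(X), y, rcond=None)           # output-layer weights

def analytic_gradient(x):
    """d(yhat)/dx at one point: sum_j w_j * phi_j(x) * (c_j - x) / sigma^2."""
    p = phi(x[None, :])[0]
    return (w * p) @ (centres - x) / sigma ** 2

grads = np.array([analytic_gradient(x) for x in X])
print("mean derivatives at training points:", grads.mean(axis=0).round(3))
```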