89 research outputs found

    Sparsity through evolutionary pruning prevents neuronal networks from overfitting

    Modern machine learning techniques take advantage of the exponentially rising computational power of new-generation processors. As a result, the number of parameters trained to solve complex tasks has increased greatly over the last decades. However, in contrast to our brain, these networks still fail to develop general intelligence in the sense of being able to solve several complex tasks with only one network architecture. This could be because the brain is not a randomly initialized neural network that has to be trained by simply investing a lot of computational power, but has some fixed hierarchical structure from birth. To make progress in decoding the structural basis of biological neural networks, we chose a bottom-up approach in which we evolutionarily trained small neural networks to perform a maze task. This simple maze task requires dynamic decision making with delayed rewards. We were able to show that, during the evolutionary optimization, random severance of connections leads to better generalization performance compared to fully connected networks. We conclude that sparsity is a central property of neural networks and should be considered in modern machine learning approaches.
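
    The idea of randomly severing connections during evolutionary optimization can be sketched as a mutation operator. This is only an illustration, not the authors' implementation: the function name `mutate`, the Gaussian weight perturbation, and the parameters `sigma` and `prune_prob` are assumptions introduced here.

```python
import numpy as np

def mutate(weights, mask, rng, sigma=0.1, prune_prob=0.05):
    """Hypothetical mutation step: perturb weights and randomly sever connections.

    weights: float array of connection weights
    mask:    boolean array, True where a connection still exists
    """
    # Assumed mutation: add Gaussian noise to every weight.
    new_weights = weights + rng.normal(0.0, sigma, size=weights.shape)
    # Randomly choose connections to sever; severed connections never return.
    sever = rng.random(mask.shape) < prune_prob
    new_mask = mask & ~sever
    # Severed connections carry zero weight.
    return new_weights * new_mask, new_mask

rng = np.random.default_rng(0)
weights = np.ones((4, 5))
mask = np.ones((4, 5), dtype=bool)
child_weights, child_mask = mutate(weights, mask, rng)
```

    In an evolutionary loop, such a child would be kept only if it scores at least as well as its parent on the task, so sparsity accumulates where it does not hurt fitness.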

    EPIC TTS Models: Empirical Pruning Investigations Characterizing Text-To-Speech Models

    Neural models are known to be over-parameterized, and recent work has shown that sparse text-to-speech (TTS) models can outperform dense models. Although a plethora of sparse methods has been proposed for other domains, such methods have rarely been applied in TTS. In this work, we seek to answer the question: how do selected sparse techniques affect performance and model complexity? We compare a Tacotron2 baseline with the results of applying five techniques. We then evaluate performance in terms of naturalness, intelligibility and prosody, while reporting model size and training time. Complementary to prior research, we find that pruning before or during training can achieve similar performance to pruning after training and can be trained much faster, while removing entire neurons degrades performance much more than removing parameters. To the best of our knowledge, this is the first work that compares sparsity paradigms in text-to-speech synthesis.

    An Adaptive Locally Connected Neuron Model: Focusing Neuron

    This paper presents a new artificial neuron model capable of learning its receptive field in the topological domain of inputs. The model provides adaptive and differentiable local connectivity (plasticity) applicable to any domain. It requires no other tool than the backpropagation algorithm to learn its parameters which control the receptive field locations and apertures. This research explores whether this ability makes the neuron focus on informative inputs and yields any advantage over fully connected neurons. The experiments include tests of focusing neuron networks of one or two hidden layers on synthetic and well-known image recognition data sets. The results demonstrated that the focusing neurons can move their receptive fields towards more informative inputs. In the simple two-hidden layer networks, the focusing layers outperformed the dense layers in the classification of the 2D spatial data sets. Moreover, the focusing networks performed better than the dense networks even when 70% of the weights were pruned. The tests on convolutional networks revealed that using focusing layers instead of dense layers for the classification of convolutional features may work better in some data sets.
    Comment: 45 pages, a national patent filed, submitted to Turkish Patent Office, No: -2017/17601, Date: 09.11.201

    Always-Sparse Training by Growing Connections with Guided Stochastic Exploration

    The excessive computational requirements of modern artificial neural networks (ANNs) limit the machines that can run them. Sparsification of ANNs is often motivated by time, memory and energy savings only during model inference, yielding no benefits during training. A growing body of work now focuses on providing the benefits of model sparsification during training as well. While these methods greatly improve training efficiency, the training algorithms yielding the most accurate models still materialize the dense weights or compute dense gradients during training. We propose an efficient, always-sparse training algorithm with excellent scaling to larger and sparser models, supported by its linear time complexity with respect to the model width during training and inference. Moreover, our guided stochastic exploration algorithm improves over the accuracy of previous sparse training methods. We evaluate our method on CIFAR-10/100 and ImageNet using ResNet, VGG, and ViT models, and compare it against a range of sparsification methods.

    Model Compression Techniques in Biometrics Applications: A Survey

    The development of deep learning algorithms has extensively empowered humanity's task automatization capacity. However, the huge improvement in the performance of these models is highly correlated with their increasing level of complexity, limiting their usefulness in human-oriented applications, which are usually deployed in resource-constrained devices. This led to the development of compression techniques that drastically reduce the computational and memory costs of deep learning models without significant performance degradation. This paper aims to systematize the current literature on this topic by presenting a comprehensive survey of model compression techniques in biometrics applications, namely quantization, knowledge distillation and pruning. We conduct a critical analysis of the comparative value of these techniques, focusing on their advantages and disadvantages and presenting suggestions for future work directions that can potentially improve the current methods. Additionally, we discuss and analyze the link between model bias and model compression, highlighting the need to direct compression research toward model fairness in future works.
    Comment: Under review at IEEE Journa

    Integration of Leaky-Integrate-and-Fire-Neurons in Deep Learning Architectures

    Up to now, modern machine learning has mainly been based on fitting high-dimensional functions to enormous data sets, taking advantage of huge hardware resources. We show that biologically inspired neuron models such as Leaky-Integrate-and-Fire (LIF) neurons provide novel and efficient ways of information encoding. They can be integrated in machine learning models and are a potential target for improving machine learning performance. To this end, we derived simple update rules for the LIF units from the differential equations, which are easy to integrate numerically. We apply a novel approach to train the LIF units in a supervised fashion via backpropagation, by assigning a constant value to the derivative of the neuron activation function exclusively for the backpropagation step. This simple mathematical trick helps to distribute the error between the neurons of the pre-connected layer. We apply our method to the IRIS blossoms image data set and show that the training technique can be used to train LIF neurons on image classification tasks. Furthermore, we show how to integrate our method in the Keras (TensorFlow) framework and run it efficiently on GPUs. To generate a deeper understanding of the mechanisms during training, we developed interactive illustrations, which we provide online. With this study we want to contribute to the current efforts to enhance machine intelligence by integrating principles from biology.
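
    A minimal sketch of what a numerically integrated LIF update rule and a constant backward-pass derivative could look like. The abstract does not state the equations, so everything here is an assumption: the textbook form tau * dV/dt = -(V - v_rest) + I(t), the forward-Euler step, the reset-on-spike behaviour, and all names and default values.

```python
import numpy as np

def simulate_lif(input_current, dt=1.0, tau=10.0,
                 v_rest=0.0, v_thresh=1.0, v_reset=0.0):
    """Forward-Euler integration of an assumed LIF equation.

    Integrates tau * dV/dt = -(V - v_rest) + I(t) and emits a spike
    (followed by a reset) whenever V reaches v_thresh.
    """
    v = v_rest
    voltages = np.empty(len(input_current))
    spikes = np.zeros(len(input_current), dtype=bool)
    for t, i_t in enumerate(input_current):
        # Euler update rule derived from the differential equation.
        v = v + (dt / tau) * (-(v - v_rest) + i_t)
        if v >= v_thresh:
            spikes[t] = True
            v = v_reset
        voltages[t] = v
    return voltages, spikes

def surrogate_spike_grad(v, constant=1.0):
    """Backward-pass stand-in: a constant replaces the spike derivative."""
    return np.full_like(np.asarray(v, dtype=float), constant)

voltages, spikes = simulate_lif(np.full(50, 2.0))
```

    The hard threshold has zero derivative almost everywhere, which is why a constant stand-in is used only in the backward step while the forward pass keeps the true spiking behaviour.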

    Analysis and visualization of sleep stages based on deep neural networks

    Automatic sleep stage scoring based on deep neural networks has come into focus of sleep researchers and physicians, as a reliable method able to objectively classify sleep stages would save human resources and simplify clinical routines. Due to novel open-source software libraries for machine learning, in combination with enormous recent progress in hardware development, a paradigm shift in the field of sleep research towards automatic diagnostics might be imminent. We argue that modern machine learning techniques are not just a tool to perform automatic sleep stage classification, but are also a creative approach to find hidden properties of sleep physiology. We have already developed and established algorithms to visualize and cluster EEG data, facilitating first assessments on sleep health in terms of sleep-apnea and consequently reduced daytime vigilance. In the following study, we further analyze cortical activity during sleep by determining the probabilities of momentary sleep stages, represented as hypnodensity graphs and then computing vectorial cross-correlations of different EEG channels. We can show that this measure serves to estimate the period length of sleep cycles and thus can help to find disturbances due to pathological conditions.
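
    One way to read "vectorial cross-correlation" is a lagged correlation in which the stage-probability vectors of two channels are compared by their dot product at each time step. The sketch below follows that reading; the definition, the normalization, the peak-picking rule, and the function names are assumptions made here, not the authors' method, and the data are synthetic.

```python
import numpy as np

def vector_cross_correlation(a, b, max_lag):
    """Lagged correlation of two (time x stages) hypnodensity arrays.

    Assumed definition: mean-center each stage column, then sum the dot
    products of a[t] and b[t + lag], normalized so that identical inputs
    give 1.0 at lag 0.
    """
    a = a - a.mean(axis=0)
    b = b - b.mean(axis=0)
    n = len(a)
    norm = np.sqrt((a ** 2).sum() * (b ** 2).sum())
    cc = np.empty(max_lag + 1)
    for lag in range(max_lag + 1):
        cc[lag] = (a[:n - lag] * b[lag:]).sum() / norm
    return cc

def estimate_period(cc):
    """Return the first local maximum at a lag greater than zero, if any."""
    for lag in range(1, len(cc) - 1):
        if cc[lag] > cc[lag - 1] and cc[lag] >= cc[lag + 1]:
            return lag
    return None

# Synthetic two-stage hypnodensity with a 90-epoch cycle (illustrative only).
t = np.arange(540)
p = 0.5 + 0.5 * np.sin(2 * np.pi * t / 90)
hypno = np.column_stack([p, 1.0 - p])
cc = vector_cross_correlation(hypno, hypno, max_lag=120)
period = estimate_period(cc)
```

    On real recordings the two inputs would be the hypnodensity arrays obtained from two different EEG channels, and the estimated lag would be in units of scoring epochs.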
