Sparsity through evolutionary pruning prevents neuronal networks from overfitting
Modern machine learning techniques take advantage of the exponentially rising
computing power of new generations of processing units. As a result, the number
of parameters trained to solve complex tasks has grown enormously over the last
decades. However, in contrast to our brain, the networks still fail to develop
general intelligence in the sense of being able to solve several complex tasks
with only one network architecture. This could be because the brain is not a
randomly initialized neural network that has to be trained simply by investing
a lot of computing power, but instead possesses a fixed hierarchical structure
from birth. To make progress in decoding the structural basis of biological
neural networks, we here chose a bottom-up approach in which we evolutionarily
trained small neural networks to perform a maze task. This simple maze task
requires dynamic decision making with delayed rewards. We show that, during
evolutionary optimization, random severance of connections led to better
generalization performance compared to fully connected networks. We conclude
that sparsity is a central property of neural networks and should be considered
in modern machine learning approaches.
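
The abstract does not spell out the optimization loop, but the idea of randomly
severing connections inside an evolutionary search can be made concrete. Below
is a minimal, hypothetical Python sketch (all names and hyperparameters are
assumptions, not the authors' implementation): genomes are flat weight vectors,
mutation adds Gaussian noise, and a small fraction of connections is randomly
cut each generation.

import numpy as np

def evolve_sparse_population(evaluate_fitness, n_weights=256, pop_size=20,
                             generations=100, sever_prob=0.05, mutate_std=0.1):
    """Toy evolutionary loop: mutate flat weight genomes and randomly
    sever connections (set them to zero), keeping the fittest half."""
    population = [np.random.randn(n_weights) for _ in range(pop_size)]
    for _ in range(generations):
        ranked = sorted(population, key=evaluate_fitness, reverse=True)
        parents = ranked[: pop_size // 2]
        children = []
        for parent in parents:
            child = parent + np.random.randn(n_weights) * mutate_std
            # Random severance: a fraction of connections is cut, which
            # the abstract reports improves generalization.
            child[np.random.rand(n_weights) < sever_prob] = 0.0
            children.append(child)
        population = parents + children
    return max(population, key=evaluate_fitness)

Offspring that tolerate severed weights, i.e. that do not depend on any single
connection, dominate the population over generations, which is one intuition
for why this acts as a regularizer.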
EPIC TTS Models: Empirical Pruning Investigations Characterizing Text-To-Speech Models
Neural models are known to be over-parameterized, and recent work has shown
that sparse text-to-speech (TTS) models can outperform dense models. Although a
plethora of sparse methods has been proposed for other domains, such methods
have rarely been applied in TTS. In this work, we seek to answer the question:
how do selected sparsity techniques affect performance and model complexity? We
compare a Tacotron2 baseline against the results of applying five techniques,
and we evaluate performance in terms of naturalness, intelligibility and
prosody, while reporting model size and
training time. Complementary to prior research, we find that pruning before or
during training can achieve similar performance to pruning after training and
can be trained much faster, while removing entire neurons degrades performance
much more than removing parameters. To our best knowledge, this is the first
work that compares sparsity paradigms in text-to-speech synthesis.
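
For readers unfamiliar with the variants being compared, the following sketch
illustrates the generic distinction between unstructured (per-parameter) and
structured (per-neuron) magnitude pruning. This is not the paper's
implementation; the function names and scoring rules are assumptions.

import numpy as np

def magnitude_prune(weights, sparsity):
    """Unstructured pruning: zero the smallest-magnitude parameters.
    Returns the pruned weights and the surviving binary mask."""
    k = int(sparsity * weights.size)
    if k == 0:
        return weights.copy(), np.ones(weights.shape, dtype=bool)
    threshold = np.partition(np.abs(weights).ravel(), k - 1)[k - 1]
    mask = np.abs(weights) > threshold
    return weights * mask, mask

def neuron_prune(weights, sparsity):
    """Structured pruning: remove entire output neurons (rows) with the
    smallest L2 norms -- the variant the abstract reports as far more
    damaging than removing individual parameters."""
    n_drop = int(sparsity * weights.shape[0])
    drop = np.argsort(np.linalg.norm(weights, axis=1))[:n_drop]
    pruned = weights.copy()
    pruned[drop, :] = 0.0
    return pruned

Applied once after convergence this is one-shot pruning; applied repeatedly
with a rising sparsity target it becomes gradual pruning during training, the
regime the abstract reports as similarly accurate but much faster to train.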
An Adaptive Locally Connected Neuron Model: Focusing Neuron
This paper presents a new artificial neuron model capable of learning its
receptive field in the topological domain of inputs. The model provides
adaptive and differentiable local connectivity (plasticity) applicable to any
domain. It requires no other tool than the backpropagation algorithm to learn
its parameters which control the receptive field locations and apertures. This
research explores whether this ability makes the neuron focus on informative
inputs and yields any advantage over fully connected neurons. The experiments
include tests of focusing neuron networks of one or two hidden layers on
synthetic and well-known image recognition data sets. The results demonstrated
that the focusing neurons can move their receptive fields towards more
informative inputs. In the simple two-hidden-layer networks, the focusing
layers outperformed the dense layers in the classification of the 2D spatial
data sets. Moreover, the focusing networks performed better than the dense
networks even when 70% of the weights were pruned. The tests on
convolutional networks revealed that using focusing layers instead of dense
layers for the classification of convolutional features may work better in some
data sets.
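
To make the "locations and apertures" parameters concrete, here is a minimal
NumPy sketch of what such a focusing layer's forward pass could look like,
assuming a Gaussian envelope over normalized input positions. The envelope form
and all names are assumptions; the paper may parameterize the receptive field
differently.

import numpy as np

def focusing_layer(x, weights, mu, sigma):
    """Forward pass of a 'focusing' layer: each unit weights its inputs
    by a Gaussian envelope over input positions, with learnable center
    mu (location) and width sigma (aperture). Both are differentiable,
    so plain backpropagation can move and resize the receptive field.

    x:       (batch, n_in) inputs
    weights: (n_units, n_in) connection weights
    mu:      (n_units, 1) receptive-field centers in [0, 1]
    sigma:   (n_units, 1) receptive-field apertures
    """
    pos = np.linspace(0.0, 1.0, x.shape[1])             # input coordinates
    envelope = np.exp(-0.5 * ((pos - mu) / sigma) ** 2)  # (n_units, n_in)
    return x @ (weights * envelope).T                    # (batch, n_units)

Because the envelope multiplies the weights, gradients with respect to mu and
sigma are well defined, which is what lets backpropagation alone relocate the
receptive field toward informative inputs.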
Always-Sparse Training by Growing Connections with Guided Stochastic Exploration
The excessive computational requirements of modern artificial neural networks
(ANNs) are posing limitations on the machines that can run them. Sparsification
of ANNs is often motivated by time, memory and energy savings only during model
inference, yielding no benefits during training. A growing body of work is now
focusing on providing the benefits of model sparsification also during
training. While these methods greatly improve the training efficiency, the
training algorithms yielding the most accurate models still materialize the
dense weights, or compute dense gradients during training. We propose an
efficient, always-sparse training algorithm with excellent scaling to larger
and sparser models, supported by its linear time complexity with respect to the
model width during training and inference. Moreover, our guided stochastic
exploration algorithm improves over the accuracy of previous sparse training
methods. We evaluate our method on CIFAR-10/100 and ImageNet using ResNet, VGG,
and ViT models, and compare it against a range of sparsification methods.
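
The abstract does not detail the algorithm, but a generic prune-and-grow update
in which growth candidates are drawn from a random subset, so that no dense
weight or gradient structure is ever materialized, conveys the flavor of
always-sparse training with stochastic exploration. The sketch below is
illustrative only; the parameter names and candidate-scoring rule are
assumptions, not the paper's method.

import numpy as np

def prune_and_grow(weights, mask, grad_estimate, update_frac=0.1):
    """One sparse-training topology update on flat arrays: drop the
    weakest active connections, then grow new ones chosen from a small
    randomly sampled candidate set (the stochastic exploration)."""
    active = np.flatnonzero(mask)
    inactive = np.flatnonzero(~mask)
    n_swap = int(update_frac * active.size)
    # Prune: remove the active connections with the smallest magnitude.
    drop = active[np.argsort(np.abs(weights[active]))[:n_swap]]
    mask[drop] = False
    weights[drop] = 0.0
    # Grow: sample a candidate subset of inactive connections and
    # activate those with the largest (estimated) gradient magnitude.
    cand = np.random.choice(inactive, size=min(4 * n_swap, inactive.size),
                            replace=False)
    grow = cand[np.argsort(-np.abs(grad_estimate[cand]))[:n_swap]]
    mask[grow] = True
    return weights, mask

Sampling candidates, rather than scoring every inactive connection, is what
keeps the cost linear in the number of active weights rather than in the size
of the dense model.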
Model Compression Techniques in Biometrics Applications: A Survey
The development of deep learning algorithms has greatly expanded humanity's
capacity for task automation. However, the large improvement in the
performance of these models is highly correlated with their increasing level of
complexity, limiting their usefulness in human-oriented applications, which are
usually deployed in resource-constrained devices. This led to the development
of compression techniques that drastically reduce the computational and memory
costs of deep learning models without significant performance degradation. This
paper aims to systematize the current literature on this topic by presenting a
comprehensive survey of model compression techniques in biometrics
applications, namely quantization, knowledge distillation and pruning. We
conduct a critical analysis of the comparative value of these techniques,
focusing on their advantages and disadvantages and presenting suggestions for
future work directions that can potentially improve the current methods.
Additionally, we discuss and analyze the link between model bias and model
compression, highlighting the need to direct compression research toward model
fairness in future works.
Integration of Leaky-Integrate-and-Fire-Neurons in Deep Learning Architectures
Up to now, modern machine learning has mainly been based on fitting
high-dimensional functions to enormous data sets, taking advantage of huge
hardware resources. We show that biologically inspired neuron models such as
the Leaky-Integrate-and-Fire (LIF) neuron provide novel and efficient ways of
information encoding. They can be integrated into machine learning models and
are a potential target for improving machine learning performance.
We therefore derived simple update rules for the LIF units from their
differential equations, which are easy to integrate numerically. We apply a
novel approach to train the LIF units in a supervised fashion via
backpropagation, assigning a constant value to the derivative of the neuron
activation function exclusively for the backpropagation step. This simple
mathematical trick helps to distribute the error among the neurons of the
pre-connected layer. We apply our method to the IRIS blossoms image data set
and show that the training technique can be used to train LIF neurons on image
classification tasks.
Furthermore, we show how to integrate our method into the Keras (TensorFlow)
framework and efficiently run it on GPUs. To generate a deeper understanding of
the mechanisms during training we developed interactive illustrations, which we
provide online.
With this study, we want to contribute to the current efforts to enhance
machine intelligence by integrating principles from biology.
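
The constant-derivative trick described above amounts to a custom gradient for
the spike nonlinearity: a Heaviside step in the forward pass, a constant
surrogate in the backward pass. A minimal TensorFlow sketch, assuming Euler
integration of the LIF equation and a surrogate constant of 1.0 (both are
assumptions; the paper's exact update rule may differ), could look as follows.

import tensorflow as tf

@tf.custom_gradient
def spike(v_minus_threshold):
    """Heaviside spike in the forward pass; a constant surrogate
    derivative (here 1.0) in the backward pass, so the error can still
    be distributed to the pre-connected layer."""
    out = tf.cast(v_minus_threshold > 0.0, tf.float32)
    def grad(dy):
        return dy * 1.0  # constant derivative, used only for backprop
    return out, grad

def lif_step(v, x, tau=20.0, v_reset=0.0, v_th=1.0, dt=1.0):
    """One Euler-integration step of a leaky integrate-and-fire unit:
    dv/dt = (-v + x) / tau, with spike and reset when v crosses v_th."""
    v = v + dt * (-v + x) / tau
    s = spike(v - v_th)
    v = v * (1.0 - s) + v_reset * s  # reset membrane where a spike fired
    return v, s

Unrolling lif_step over time inside a Keras model gives a recurrent layer that
standard optimizers can train end to end, despite the non-differentiable spike.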
Analysis and visualization of sleep stages based on deep neural networks
Automatic sleep stage scoring based on deep neural networks has come into the focus of sleep researchers and physicians, as a reliable method able to objectively classify sleep stages would save human resources and simplify clinical routines. Due to novel open-source software libraries for machine learning, in combination with enormous recent progress in hardware development, a paradigm shift in the field of sleep research towards automatic diagnostics might be imminent. We argue that modern machine learning techniques are not just a tool to perform automatic sleep stage classification, but also a creative approach to finding hidden properties of sleep physiology. We have already developed and established algorithms to visualize and cluster EEG data, facilitating first assessments of sleep health in terms of sleep apnea and consequently reduced daytime vigilance. In the following study, we further analyze cortical activity during sleep by determining the probabilities of momentary sleep stages, represented as hypnodensity graphs, and then computing vectorial cross-correlations of different EEG channels. We show that this measure serves to estimate the period length of sleep cycles and thus can help to find disturbances due to pathological conditions.
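
As a rough illustration of the cross-correlation idea, the sketch below
estimates a cycle period from two scalar hypnodensity-derived signals. The
actual study uses vectorial cross-correlations between EEG channels, and the
epoch length, lag window and peak-picking rule here are all assumptions.

import numpy as np

def cycle_period_minutes(hypno_a, hypno_b, fs=1 / 30.0):
    """Estimate the sleep-cycle period from two hypnodensity-like
    signals via their cross-correlation: the lag of the strongest
    off-center peak approximates the cycle length.
    fs: sampling rate in Hz (here one 30 s scoring epoch per sample)."""
    a = (hypno_a - hypno_a.mean()) / hypno_a.std()
    b = (hypno_b - hypno_b.mean()) / hypno_b.std()
    xc = np.correlate(a, b, mode="full") / a.size
    lags = np.arange(-a.size + 1, a.size)
    # Ignore the trivial near-zero-lag region (< 20 min) before peak-picking.
    valid = np.abs(lags) >= int(20 * 60 * fs)
    peak_lag = lags[valid][np.argmax(xc[valid])]
    return abs(peak_lag) / fs / 60.0  # period in minutes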