Statistical Mechanics of Soft Margin Classifiers
We study the typical learning properties of the recently introduced Soft
Margin Classifiers (SMCs), learning realizable and unrealizable tasks, with the
tools of Statistical Mechanics. We derive analytically the behaviour of the
learning curves in the regime of very large training sets. We obtain
exponential and power laws for the decay of the generalization error towards
the asymptotic value, depending on the task and on general characteristics of
the distribution of stabilities of the patterns to be learned. The optimal
learning curves of the SMCs, which give the minimal generalization error, are
obtained by tuning the coefficient controlling the trade-off between the error
and the regularization terms in the cost function. If the task is realizable by
the SMC, the optimal performance is better than that of a hard margin Support
Vector Machine and is very close to that of a Bayesian classifier.
Comment: 26 pages, 12 figures, submitted to Physical Review
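The trade-off coefficient mentioned in the abstract can be made concrete with a minimal numpy sketch of a soft-margin cost function (regularization term plus a C-weighted hinge loss over the pattern stabilities); the symbols w, b and C below are illustrative, not the paper's notation.

```python
import numpy as np

def soft_margin_cost(w, b, X, y, C):
    """Soft-margin cost: regularization term plus C-weighted hinge loss.

    C controls the trade-off between the regularization term and the
    training-error term, analogous to the coefficient tuned to obtain
    the optimal learning curves.
    """
    margins = y * (X @ w + b)                 # stabilities of the patterns
    hinge = np.maximum(0.0, 1.0 - margins)    # per-pattern error term
    return 0.5 * np.dot(w, w) + C * hinge.sum()

# Tiny separable example: two points on either side of the origin.
X = np.array([[1.0], [-1.0]])
y = np.array([1.0, -1.0])
# With w = [1], b = 0, both margins equal 1, so the hinge term vanishes.
print(soft_margin_cost(np.array([1.0]), 0.0, X, y, C=10.0))  # 0.5
```

Tuning C trades training error against margin size, which is what selects the optimal learning curve in the analysis above.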
Beyond neural scaling laws: beating power law scaling via data pruning
Widely observed neural scaling laws, in which error falls off as a power of
the training set size, model size, or both, have driven substantial performance
improvements in deep learning. However, these improvements through scaling
alone require considerable costs in compute and energy. Here we focus on the
scaling of error with dataset size and show how in theory we can break beyond
power law scaling and potentially even reduce it to exponential scaling instead
if we have access to a high-quality data pruning metric that ranks the order in
which training examples should be discarded to achieve any pruned dataset size.
We then test this improved scaling prediction with pruned dataset size
empirically, and indeed observe better than power law scaling in practice on
ResNets trained on CIFAR-10, SVHN, and ImageNet. Next, given the importance of
finding high-quality pruning metrics, we perform the first large-scale
benchmarking study of ten different data pruning metrics on ImageNet. We find
most existing high performing metrics scale poorly to ImageNet, while the best
are computationally intensive and require labels for every image. We therefore
developed a new simple, cheap and scalable self-supervised pruning metric that
demonstrates comparable performance to the best supervised metrics. Overall,
our work suggests that the discovery of good data-pruning metrics may provide a
viable path forward to substantially improved neural scaling laws, thereby
reducing the resource costs of modern deep learning.
Comment: Outstanding Paper Award @ NeurIPS 2022. Added github link to metric score
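The core operation a pruning metric enables can be sketched as a simple ranking rule: score every training example, then discard from the bottom of the ranking until the pruned dataset reaches the target size. The scores below are hypothetical placeholders, not one of the ten benchmarked metrics.

```python
import numpy as np

def prune_by_metric(scores, keep_fraction):
    """Return sorted indices of the examples kept after pruning.

    `scores` is a per-example metric value (higher = keep first);
    examples are discarded from the bottom of the ranking until the
    pruned dataset reaches the requested fraction of the original size.
    """
    n_keep = int(round(keep_fraction * len(scores)))
    order = np.argsort(scores)[::-1]          # rank examples, best first
    return np.sort(order[:n_keep])            # indices of the kept subset

# Hypothetical metric scores for 10 examples; keep the top 40%.
scores = np.array([0.9, 0.1, 0.5, 0.8, 0.3, 0.7, 0.2, 0.6, 0.4, 0.0])
kept = prune_by_metric(scores, keep_fraction=0.4)
print(kept)  # [0 3 5 7]
```

The theoretical claim is about how error decays as `keep_fraction` shrinks: with a good ranking, the decay can beat the power law observed for random subsets.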
A survey on online active learning
Online active learning is a paradigm in machine learning that aims to select
the most informative data points to label from a data stream. The problem of
minimizing the cost associated with collecting labeled observations has gained
a lot of attention in recent years, particularly in real-world applications
where data is only available in an unlabeled form. Annotating each observation
can be time-consuming and costly, making it difficult to obtain large amounts
of labeled data. To overcome this issue, many active learning strategies have
been proposed in the last decades, aiming to select the most informative
observations for labeling in order to improve the performance of machine
learning models. These approaches can be broadly divided into two categories:
static pool-based and stream-based active learning. Pool-based active learning
involves selecting a subset of observations from a closed pool of unlabeled
data, and it has been the focus of many surveys and literature reviews.
However, the growing availability of data streams has led to an increase in the
number of approaches that focus on online active learning, which involves
continuously selecting and labeling observations as they arrive in a stream.
This work aims to provide an overview of the most recently proposed approaches
for selecting the most informative observations from data streams in the
context of online active learning. We review the various techniques that have
been proposed and discuss their strengths and limitations, as well as the
challenges and opportunities that exist in this area of research. Our review
aims to provide a comprehensive and up-to-date overview of the field and to
highlight directions for future work.
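The stream-based setting the survey covers can be illustrated with a minimal sketch of uncertainty-based selective sampling: each arriving observation is labeled only if the model's confidence falls below a threshold. This is one strategy among the many reviewed, and the threshold and simulated confidences are illustrative.

```python
import random

def stream_query_decision(confidence, threshold=0.7):
    """Stream-based selective sampling: request the label only when the
    model's confidence on the incoming observation is below a threshold.

    Unlike pool-based selection, the decision is made once, as the
    observation arrives, without access to the rest of the stream.
    """
    return confidence < threshold

# Simulated stream of model confidences for arriving observations.
random.seed(0)
stream = [random.random() for _ in range(1000)]
queried = sum(stream_query_decision(c) for c in stream)
print(f"labels requested for {queried} of {len(stream)} observations")
```

The threshold is the knob that trades labeling cost against model improvement, which is exactly the cost-minimization problem framed above.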
Estimating wind turbine generators failures using machine learning
The objective of this thesis is to estimate failures of wind turbine generators using real data. It seeks to predict the failure and model its reliability. In order to achieve this goal, machine learning algorithms such as neural networks, support vector machines and decision trees will be used.
A machine learning approximation of the 2015 Portuguese high school student grades: A hybrid approach
Costa-Mendes, R., Oliveira, T., Castelli, M., & Cruz-Jesus, F. (2021). A machine learning approximation of the 2015 Portuguese high school student grades: A hybrid approach. Education and Information Technologies, 26(2), 1527-1547. https://doi.org/10.1007/s10639-020-10316-y
This article uses an anonymous 2014–15 school year dataset from the Directorate-General for Statistics of Education and Science (DGEEC) of the Portuguese Ministry of Education to carry out a predictive power comparison between the classic multilinear regression model and a chosen set of machine learning algorithms. A multilinear regression model is used in parallel with random forest, support vector machine, artificial neural network and extreme gradient boosting machine stacking ensemble implementations. A hybrid analysis is designed in which classical statistical analysis and artificial intelligence algorithms are blended to augment the ability to retain valuable conclusions and well-supported results. The machine learning algorithms attain a higher level of predictive ability. In addition, the appropriateness of stacking increases as the determinant of the base learner output correlation matrix increases, and the random forest feature importance empirical distributions are correlated with the structure of p-values and statistical significance tests of the multiple linear model. An information system that supports the nationwide education system should be designed and further structured to collect meaningful and precise data about the full range of academic achievement antecedents. The article concludes that no evidence is found in favour of smaller classes.
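The stacking idea compared against multilinear regression above can be illustrated with a minimal numpy sketch, using ordinary least squares both for the base learners and for the meta learner. The data, feature subsets and learners are stand-ins, not the article's dataset or its random forest / SVM / ANN / XGBoost base learners.

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical data standing in for the student-grade dataset.
X = rng.normal(size=(200, 3))
y = X @ np.array([1.5, -2.0, 0.5]) + rng.normal(scale=0.1, size=200)

def fit_linear(X, y):
    """Ordinary least squares with an intercept (the classic baseline)."""
    Xb = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(Xb, y, rcond=None)
    return lambda Xnew: np.column_stack([np.ones(len(Xnew)), Xnew]) @ coef

# Base learners: two models fitted on different feature subsets
# (stand-ins for the article's heterogeneous base learners).
base1 = fit_linear(X[:, :2], y)
base2 = fit_linear(X[:, 1:], y)

# Meta learner: combines the base-learner outputs (the stacking step).
Z = np.column_stack([base1(X[:, :2]), base2(X[:, 1:])])
meta = fit_linear(Z, y)

rmse = np.sqrt(np.mean((meta(Z) - y) ** 2))
print(f"stacked RMSE: {rmse:.3f}")
```

The meta learner only sees the base-learner predictions, which is why the article's point about the base learner output correlation matrix matters: the less correlated the base outputs, the more the meta learner has to work with.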
Neuronal Models of Motor Sequence Learning in the Songbird
Communication of complex content is an important ability in our everyday life. For communication to be possible, several requirements need to be met. The individual communicated to has to learn to associate a certain meaning with a given sound. In the brain, this sound is represented as a spatio-temporal pattern of spikes, which thus has to be associated with a different spike pattern representing its meaning. In this thesis, models for associative learning in spiking neurons are introduced in chapters 6 and 7. There, a new biologically plausible learning mechanism is proposed, in which a property of the neuronal dynamics - the hyperpolarization of a neuron after each spike it produces - is coupled with a homeostatic plasticity mechanism, which acts to balance inputs into the neuron. In chapter 6, the mechanism used is a version of spike-timing-dependent plasticity (STDP), a property that has been observed experimentally: the direction and amplitude of synaptic change depend on the precise timing of pre- and postsynaptic spiking activity. This mechanism is applied to associative learning of output spikes in response to purely spatial spiking patterns. In chapter 7, a new learning rule is introduced, which is derived from the objective of a balanced membrane potential. This learning rule is shown to be equivalent to a version of STDP and is applied to associative learning of precisely timed output spikes in response to spatio-temporal input patterns.
The individual communicating has to learn to reproduce certain sounds (which can be associated with a given meaning). To that end, a memory of the sound sequence has to be formed. Since sound sequences are represented as sequences of activation patterns in the brain, learning of a given sequence of spike patterns is an interesting problem for theoretical considerations. Here, it is shown that the biologically plausible learning mechanism introduced for associative learning enables recurrently coupled networks of spiking neurons to learn to reproduce given sequences of spikes. These results are presented in chapter 9.
Finally, the communicator has to translate the sensory memory into motor actions that serve to reproduce the target sound. This process is investigated in the framework of inverse model learning, where the learner learns to invert the action-perception cycle by mapping perceptions back onto the actions that caused them. Two different setups for inverse model learning are investigated. In chapter 5, a simple setup for inverse model learning is coupled with the learning algorithm used for perceptron learning in chapter 6, and it is shown that models of the sound generation and perception process, which are non-linear and non-local in time, can be inverted if the width of the distribution of time delays of self-generated inputs caused by an individual motor spike is not too large. This limitation is mitigated by the model introduced in chapter 8. Both these models have experimentally testable consequences, namely a dip in the autocorrelation function of the spike times in the motor population at the duration of the loop delay, i.e. the time it takes for a motor activation to cause a sound and thus a sensory activation, plus the time that this sensory activation takes to be looped back to the motor population. Furthermore, both models predict neurons that are active during the sound generation and during the passive playback of the sound, with a time delay equivalent to the loop delay. Finally, the inverse model presented in chapter 8 additionally predicts mirror neurons without a time delay. Both types of mirror neurons have been observed in the songbird [GKGH14, PPNM08], a popular animal model for vocal imitation learning.
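The pairwise STDP property the abstract describes (direction and amplitude of synaptic change depending on the relative timing of pre- and postsynaptic spikes) can be sketched as a standard exponential STDP window; the amplitudes and time constant below are illustrative, not parameters from the thesis.

```python
import numpy as np

def stdp_update(dt, a_plus=0.01, a_minus=0.012, tau=20.0):
    """Pairwise STDP window: weight change as a function of the
    spike-time difference dt = t_post - t_pre (milliseconds).

    Pre-before-post (dt > 0) potentiates; post-before-pre (dt < 0)
    depresses; the amplitude decays exponentially with |dt|.
    """
    if dt > 0:
        return a_plus * np.exp(-dt / tau)     # potentiation (LTP)
    elif dt < 0:
        return -a_minus * np.exp(dt / tau)    # depression (LTD)
    return 0.0

print(stdp_update(10.0))   # positive: pre spike preceded the post spike
print(stdp_update(-10.0))  # negative: post spike preceded the pre spike
```

A rule of this shape, combined with post-spike hyperpolarization and homeostatic input balancing as described above, is what drives the associative learning of precisely timed output spikes.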