
    Generalization in Deep Learning

    This paper provides theoretical insights into why and how deep learning can generalize well, despite its large capacity, complexity, possible algorithmic instability, nonrobustness, and sharp minima, responding to an open question in the literature. We also discuss approaches to provide non-vacuous generalization guarantees for deep learning. Based on theoretical observations, we propose new open problems and discuss the limitations of our results. Comment: To appear in Mathematics of Deep Learning, Cambridge University Press. All previous results remain unchanged.

    Generalization in Deep Learning

    With a direct analysis of neural networks, this paper presents a mathematically tight generalization theory to partially address an open problem regarding the generalization of deep learning. Unlike previous bound-based theory, our main theory is quantitatively as tight as possible for every dataset individually, while producing competitive qualitative insights. Our results give insight into why and how deep learning can generalize well, despite its large capacity, complexity, possible algorithmic instability, nonrobustness, and sharp minima, answering an open question in the literature. We also discuss limitations of our results and propose additional open problems.

    Using an Hebbian learning rule for multi-class SVM classifiers.

    http://journals.kluweronline.com/article.asp?PIPS=5384399
    Regarding biological visual classification, a recent series of experiments has highlighted the fact that data classification can be performed in the human visual cortex with latencies of about 100-150 ms, which, considering the latencies of the visual pathways, is only compatible with a very specific processing architecture, described by the models of Thorpe et al. Surprisingly enough, this experimental evidence is coherent with algorithms derived from statistical learning theory. More precisely, there is a double link: on the one hand, the so-called Vapnik theory offers tools to evaluate and analyze the performance of the biological model; on the other hand, this model is an interesting front-end for algorithms derived from the Vapnik theory. The present contribution develops this idea, introducing a model derived from statistical learning theory and using the biological model of Thorpe et al. We evaluate its performance on a restricted sign-language recognition experiment. This paper is intended to be read by biologists as well as statisticians; as a consequence, basic material in both fields has been reviewed.
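    The paper's specific Hebbian formulation for multi-class SVM training is not given in the abstract, but the general idea of a Hebbian weight update, strengthening the connections between co-active inputs and outputs, can be sketched minimally as follows (all names and the toy dimensions here are illustrative assumptions, not taken from the paper):

    ```python
    import numpy as np

    def hebbian_update(W, x, y, lr=0.1):
        """One Hebbian step: grow each weight W[i, j] in proportion
        to the co-activation of output y[i] and input x[j]."""
        # Illustrative sketch only; the paper's actual rule is not in the abstract.
        return W + lr * np.outer(y, x)

    # Hypothetical toy setting: 3 classes, 4 input features.
    W = np.zeros((3, 4))
    x = np.array([1.0, 0.0, 1.0, 0.0])   # input activations
    y = np.array([1.0, 0.0, 0.0])        # one-hot target for class 0
    W = hebbian_update(W, x, y)
    # Only the row for the active class changes, and only where x is active.
    ```

    In a multi-class linear classifier each row of `W` plays the role of one class's weight vector, so the update above reinforces only the row of the observed class.
    
    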