
    Statistical physics and practical training of soft-committee machines

    Equilibrium states of large layered neural networks with a differentiable activation function and a single linear output unit are investigated using the replica formalism. The quenched free energy of a student network with a very large number of hidden units learning a rule of perfectly matching complexity is calculated analytically. The system undergoes a first-order phase transition from unspecialized to specialized student configurations at a critical size of the training set. Computer simulations of learning by stochastic gradient descent from a fixed training set demonstrate that the equilibrium results quantitatively describe the plateau states which occur in practical training procedures at sufficiently small but finite learning rates.
    Comment: 11 pages, 4 figures
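    The setup described above is straightforward to reproduce numerically. Below is a minimal sketch of a soft-committee machine (sigmoidal hidden units, fixed linear output) trained by stochastic gradient descent on a fixed training set generated by a teacher of matching complexity. The network sizes, the erf activation, and all hyperparameters are illustrative assumptions, not values taken from the paper.

```python
import numpy as np
from scipy.special import erf

# Sketch: student-teacher training of a soft-committee machine by SGD.
# All sizes and hyperparameters are illustrative assumptions.
rng = np.random.default_rng(0)

N, K, P = 100, 3, 600           # input dim, hidden units, training set size
eta, epochs = 0.05, 100         # small learning rate, training sweeps

def g(x):                       # sigmoidal hidden-unit activation
    return erf(x / np.sqrt(2.0))

def output(W, xi):              # soft committee: unweighted sum of hidden units
    return g(W @ xi / np.sqrt(N)).sum()

# A teacher of perfectly matching complexity (same K) defines the rule.
B = rng.standard_normal((K, N))
X = rng.standard_normal((P, N))
y = np.array([output(B, xi) for xi in X])

W = rng.standard_normal((K, N))         # student weights

for _ in range(epochs):
    for mu in rng.permutation(P):       # SGD over the fixed training set
        xi = X[mu]
        h = W @ xi / np.sqrt(N)
        delta = output(W, xi) - y[mu]
        gprime = np.sqrt(2.0 / np.pi) * np.exp(-h**2 / 2.0)  # g'(h)
        W -= eta * delta * np.outer(gprime, xi) / np.sqrt(N)

E = np.mean([(output(W, xi) - t)**2 for xi, t in zip(X, y)]) / 2
print(f"final training error: {E:.4f}")
```

    At sufficiently small learning rates, tracking the training error over the sweeps typically exhibits the plateau of unspecialized configurations before the student's hidden units specialize to distinct teacher units.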

    Off-lattice Kinetic Monte Carlo simulations of strained heteroepitaxial growth

    An off-lattice, continuous-space kinetic Monte Carlo (KMC) algorithm is discussed and applied to the investigation of strained heteroepitaxial crystal growth. As a starting point, we study a simplified (1+1)-dimensional situation with inter-atomic interactions given by simple pair potentials. The model exhibits the appearance of strain-induced misfit dislocations at a characteristic film thickness. In our simulations we observe a power-law dependence of this critical thickness on the lattice misfit, in agreement with experimental results for semiconductor compounds. We furthermore investigate the emergence of strain-induced multilayer islands, or "dots", upon an adsorbate wetting layer in the so-called Stranski-Krastanov (SK) growth mode. At a characteristic kinetic film thickness, a transition from monolayer to multilayer islands occurs. We discuss the microscopic causes of the SK transition and its dependence on the model parameters, i.e. lattice misfit, growth rate, and substrate temperature.
    Comment: 17 pages, 6 figures. Invited talk presented at the MFO Workshop "Multiscale modeling in epitaxial growth" (Oberwolfach, Jan. 2004). Proceedings to be published in "International Series in Numerical Mathematics" (Birkhaeuser).
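    The core of any such simulation is a rejection-free KMC event loop: each candidate move is assigned an Arrhenius rate, one move is drawn with probability proportional to its rate, and the clock advances by an exponentially distributed waiting time. The sketch below shows this selection step only; the attempt frequency, temperature, and barrier list are made-up placeholders, whereas the off-lattice model of the paper obtains barriers from the pair potentials of the actual, continuous particle configuration.

```python
import numpy as np

# Sketch: one step of a rejection-free (BKL/Gillespie-style) KMC loop.
# nu0, kT and the barrier values are illustrative placeholders.
rng = np.random.default_rng(1)

nu0 = 1.0e13        # attempt frequency [1/s], a typical assumed value
kT = 0.025          # thermal energy [eV], illustrative

def kmc_step(barriers, t):
    """Pick one event with probability proportional to its rate; advance the clock."""
    rates = nu0 * np.exp(-np.asarray(barriers) / kT)
    R = rates.sum()
    i = int(np.searchsorted(np.cumsum(rates), rng.uniform(0.0, R)))
    t += rng.exponential(1.0 / R)   # waiting time ~ Exp(total rate)
    return i, t

# Toy demo: three candidate hops with hypothetical barriers (in eV).
t = 0.0
for _ in range(5):
    event, t = kmc_step([0.50, 0.60, 0.70], t)
    print(f"executed event {event} at t = {t:.3e} s")
```

    In an off-lattice code, the expensive part is recomputing the barriers of all affected moves after each executed event, since particle positions are continuous rather than confined to lattice sites.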

    The Shallow and the Deep: A biased introduction to neural networks and old school machine learning

    The Shallow and the Deep is a collection of lecture notes that offers an accessible introduction to neural networks and machine learning in general. However, it was clear from the beginning that these notes would not be able to cover this rapidly changing and growing field in its entirety. The focus lies on classical machine learning techniques, with a bias towards classification and regression. Other learning paradigms and many recent developments in, for instance, Deep Learning are not addressed or only briefly touched upon. Biehl argues that having a solid knowledge of the foundations of the field is essential, especially for anyone who wants to explore the world of machine learning with an ambition that goes beyond the application of some software package to some data set. Therefore, The Shallow and the Deep places emphasis on fundamental concepts and theoretical background. This also involves delving into the history and pre-history of neural networks, where the foundations for most of the recent developments were laid. These notes aim to demystify machine learning and neural networks without losing the appreciation for their impressive power and versatility.

    The Statistical Physics of Learning Revisited: Typical Learning Curves in Model Scenarios

    The exchange of ideas between computer science and statistical physics has advanced the understanding of machine learning and inference significantly. This interdisciplinary approach is currently regaining momentum due to the revived interest in neural networks and deep learning. Methods borrowed from statistical mechanics complement other approaches to the theory of computational and statistical learning. In this brief review, we outline and illustrate some of the basic concepts. We exemplify the role of the statistical physics approach in terms of a particularly important contribution: the computation of typical learning curves in student-teacher scenarios of supervised learning. Two by now classical examples from the literature illustrate the approach: the learning of a linearly separable rule by a perceptron with continuous and with discrete weights, respectively. We address these prototypical problems in the simplifying limit of stochastic training at high formal temperature and obtain the corresponding learning curves.
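    As a concrete, purely numerical illustration of such a learning curve, the sketch below trains a continuous-weight student perceptron on P = αN examples of a linearly separable rule defined by a random teacher and reports the generalization error ε_g = arccos(R)/π, where R is the normalized student-teacher overlap. The plain perceptron update rule and all sizes are illustrative assumptions; the review itself derives these curves analytically in the high-temperature limit rather than by simulation.

```python
import numpy as np

# Sketch: empirical learning curve eps_g(alpha) for a student perceptron
# learning a linearly separable rule from a random teacher B.
rng = np.random.default_rng(2)
N = 200

def generalization_error(J, B):
    # For isotropic random inputs, eps_g = arccos(R) / pi,
    # with R the normalized overlap between student and teacher.
    R = J @ B / (np.linalg.norm(J) * np.linalg.norm(B))
    return np.arccos(np.clip(R, -1.0, 1.0)) / np.pi

B = rng.standard_normal(N)                  # teacher weight vector
for alpha in (0.5, 1.0, 2.0, 5.0):
    P = int(alpha * N)
    X = rng.standard_normal((P, N))         # training inputs
    y = np.sign(X @ B)                      # labels from the teacher rule
    J = np.zeros(N)                         # student weights
    for _ in range(50):                     # perceptron training sweeps
        for xi, t in zip(X, y):
            if np.sign(J @ xi) != t:        # update only on errors
                J += t * xi / np.sqrt(N)
    print(f"alpha = {alpha:>3}: eps_g ~ {generalization_error(J, B):.3f}")
```

    The decrease of ε_g with α is the empirical counterpart of the typical learning curves computed analytically in the review.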