Statistical physics and practical training of soft-committee machines
Equilibrium states of large layered neural networks with differentiable
activation function and a single, linear output unit are investigated using the
replica formalism. The quenched free energy of a student network with a very
large number of hidden units learning a rule of perfectly matching complexity
is calculated analytically. The system undergoes a first order phase transition
from unspecialized to specialized student configurations at a critical size of
the training set. Computer simulations of learning by stochastic gradient
descent from a fixed training set demonstrate that the equilibrium results
quantitatively describe the plateau states that occur in practical training
procedures at sufficiently small but finite learning rates.

Comment: 11 pages, 4 figures
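As a rough illustration of the setting, the sketch below trains a soft-committee student (K hidden units feeding a fixed linear output) on a fixed set of examples labeled by a teacher of perfectly matching architecture, using plain stochastic gradient descent at a small learning rate. The sizes N, K, P, the learning rate, and the tanh activation (a stand-in for the erf-type activations common in this literature) are illustrative assumptions, not the settings of the paper.

```python
# Minimal sketch: SGD training of a soft-committee machine in a matched
# student-teacher scenario. N, K, P, eta, and the tanh activation are
# illustrative assumptions, not the paper's exact settings.
import numpy as np

rng = np.random.default_rng(0)
N, K, P = 50, 3, 500              # input dim, hidden units, training set size
eta = 0.05                        # small but finite learning rate

def g(x):                         # differentiable activation (stand-in for erf)
    return np.tanh(x)

def dg(x):
    return 1.0 - np.tanh(x) ** 2

def output(W, xi):                # soft committee: unweighted sum over hidden units
    return g(W @ xi).sum()

B = rng.normal(size=(K, N)) / np.sqrt(N)   # teacher of matching complexity
X = rng.normal(size=(P, N))                # fixed training set
y = np.array([output(B, xi) for xi in X])

W = rng.normal(size=(K, N)) / np.sqrt(N)   # student weights
for epoch in range(200):
    for mu in rng.permutation(P):          # stochastic gradient descent
        xi = X[mu]
        err = output(W, xi) - y[mu]        # gradient of the example cost 0.5 * err**2
        W -= eta * err * np.outer(dg(W @ xi), xi)

E = 0.5 * np.mean([(output(W, xi) - t) ** 2 for xi, t in zip(X, y)])
print(f"final training error per example: {E:.4f}")
```

Monitoring E (or the overlaps of the student's hidden units with the teacher's) over many epochs is one way to observe the plateau behavior described above: the student can remain in an unspecialized state for a long time before the hidden units specialize.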
Off-lattice Kinetic Monte Carlo simulations of strained heteroepitaxial growth
An off-lattice, continuous-space Kinetic Monte Carlo (KMC) algorithm is
discussed and applied in the investigation of strained heteroepitaxial
crystal growth. As a starting point, we study a simplified
(1+1)-dimensional situation with interatomic interactions given by simple
pair potentials. The model
exhibits the appearance of strain-induced misfit dislocations at a
characteristic film thickness. In our simulations, we observe a power-law
dependence of this critical thickness on the lattice misfit, which is in
agreement with experimental results for semiconductor compounds. We furthermore
investigate the emergence of strain-induced multilayer islands, or "dots",
upon an adsorbate wetting layer in the so-called Stranski-Krastanov (SK)
growth mode. At a characteristic kinetic film thickness, a transition from
monolayer to multilayer islands occurs. We discuss the microscopic causes
of the SK transition and its dependence on the model parameters, i.e.,
lattice misfit, growth rate, and substrate temperature.

Comment: 17 pages, 6 figures. Invited talk presented at the MFO Workshop
"Multiscale modeling in epitaxial growth" (Oberwolfach, Jan. 2004).
Proceedings to be published in "International Series in Numerical
Mathematics" (Birkhäuser).
The Shallow and the Deep: A biased introduction to neural networks and old school machine learning
The Shallow and the Deep is a collection of lecture notes that offers an accessible introduction to neural networks and machine learning in general. However, it was clear from the beginning that these notes would not be able to cover this rapidly changing and growing field in its entirety. The focus lies on classical machine learning techniques, with a bias towards classification and regression. Other learning paradigms and many recent developments in, for instance, Deep Learning are not addressed or only briefly touched upon.

Biehl argues that having a solid knowledge of the foundations of the field is essential, especially for anyone who wants to explore the world of machine learning with an ambition that goes beyond the application of some software package to some data set. Therefore, The Shallow and the Deep places emphasis on fundamental concepts and theoretical background. This also involves delving into the history and pre-history of neural networks, where the foundations for most of the recent developments were laid. These notes aim to demystify machine learning and neural networks without losing the appreciation for their impressive power and versatility.
The Statistical Physics of Learning Revisited: Typical Learning Curves in Model Scenarios
The exchange of ideas between computer science and statistical physics has advanced the understanding of machine learning and inference significantly. This interdisciplinary approach is currently regaining momentum due to the revived interest in neural networks and deep learning. Methods borrowed from statistical mechanics complement other approaches to the theory of computational and statistical learning. In this brief review, we outline and illustrate some of the basic concepts. We exemplify the role of the statistical physics approach in terms of a particularly important contribution: the computation of typical learning curves in student-teacher scenarios of supervised learning. Two by now classical examples from the literature illustrate the approach: the learning of a linearly separable rule by a perceptron with continuous and with discrete weights, respectively. We address these prototypical problems in the simplifying limit of stochastic training at high formal temperature and obtain the corresponding learning curves.
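To give a feel for such learning curves, the sketch below estimates the generalization error eps_g = arccos(R)/pi of a student perceptron from its normalized overlap R with the teacher, as a function of the scaled training set size alpha = P/N. Hebbian training and the chosen sizes are simple stand-ins for the stochastic (Gibbs) training at high formal temperature analyzed in the review, used here only to keep the experiment a few lines long.

```python
# Minimal sketch: empirical learning curve for a perceptron student learning
# a linearly separable rule defined by a teacher. Hebbian training is an
# illustrative stand-in for stochastic (Gibbs) training at high temperature.
import numpy as np

rng = np.random.default_rng(2)
N = 200                                   # input dimension

def eps_g(J, B):
    """Generalization error from the normalized student-teacher overlap R."""
    R = J @ B / (np.linalg.norm(J) * np.linalg.norm(B))
    return np.arccos(R) / np.pi

B = rng.normal(size=N)                    # teacher weight vector
for alpha in (0.5, 1.0, 2.0, 5.0, 10.0):  # training set size P = alpha * N
    P = int(alpha * N)
    X = rng.normal(size=(P, N))
    y = np.sign(X @ B)                    # linearly separable labels
    J = (y[:, None] * X).sum(axis=0)      # Hebbian student weights
    print(f"alpha = {alpha:5.1f}   eps_g = {eps_g(J, B):.3f}")
```

The printed errors decrease with alpha, which is the qualitative behavior the analytical learning curves capture; the discrete-weight (Ising) case is known to exhibit, in addition, a discontinuous transition to perfect generalization, which a naive simulation of this kind can only hint at.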