Phase transitions in soft-committee machines
Equilibrium statistical physics is applied to layered neural networks with
differentiable activation functions. A first analysis of off-line learning in
soft-committee machines with a finite number (K) of hidden units learning a
perfectly matching rule is performed. Our results are exact in the limit of
high training temperatures. For K=2 we find a second order phase transition
from unspecialized to specialized student configurations at a critical size P
of the training set, whereas for K > 2 the transition is first order. Monte
Carlo simulations indicate that our results remain qualitatively valid at
moderately low temperatures. The limit K to infinity can be taken
analytically; the transition then occurs after presenting on the order of NK
examples. However, an unspecialized metastable state persists up to P = O(NK^2).
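For concreteness, a minimal sketch of the architecture studied here, in Python. The erf activation g(h) = erf(h/sqrt(2)), the field normalization by sqrt(N), and unit hidden-to-output weights are conventional assumptions, not details taken from the abstract:

import numpy as np
from scipy.special import erf

def soft_committee(x, W):
    # K hidden units with differentiable (erf) activation; the
    # hidden-to-output weights are all fixed to +1.
    N = x.shape[-1]
    h = x @ W.T / np.sqrt(N)          # aligning fields of the hidden units
    return erf(h / np.sqrt(2)).sum(axis=-1)

rng = np.random.default_rng(0)
N, K = 100, 2
W_teacher = rng.standard_normal((K, N))   # the perfectly matching rule
W_student = rng.standard_normal((K, N))
x = rng.standard_normal((5, N))
print(soft_committee(x, W_teacher))
print(soft_committee(x, W_student))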
Learning and generalization theories of large committee machines
The study of the distribution of volumes associated with the internal
representations of learning examples allows us to derive the critical learning
capacity α_c of large committee machines, to verify the stability of the
solution in the limit of a large number of hidden units, and to find a
Bayesian generalization cross-over.
Statistical physics and practical training of soft-committee machines
Equilibrium states of large layered neural networks with differentiable
activation function and a single, linear output unit are investigated using the
replica formalism. The quenched free energy of a student network with a very
large number of hidden units learning a rule of perfectly matching complexity
is calculated analytically. The system undergoes a first order phase transition
from unspecialized to specialized student configurations at a critical size of
the training set. Computer simulations of learning by stochastic gradient
descent from a fixed training set demonstrate that the equilibrium results
quantitatively describe the plateau states that occur in practical training
procedures at sufficiently small but finite learning rates.
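A hedged sketch of the training procedure these simulations describe: stochastic gradient descent on the quadratic error over a fixed training set, for a student soft-committee machine and a teacher of matching complexity. All parameter values are illustrative, and the erf activation is an assumption:

import numpy as np
from scipy.special import erf

g = lambda h: erf(h / np.sqrt(2))                        # hidden activation
dg = lambda h: np.sqrt(2 / np.pi) * np.exp(-h * h / 2)   # its derivative

def sgd_from_fixed_set(N=50, K=3, P=2000, eta=0.05, steps=200_000, seed=1):
    rng = np.random.default_rng(seed)
    B = rng.standard_normal((K, N))        # teacher of matching complexity
    J = 0.1 * rng.standard_normal((K, N))  # unspecialized student start
    X = rng.standard_normal((P, N))        # the fixed training set
    y = g(X @ B.T / np.sqrt(N)).sum(axis=1)
    for _ in range(steps):
        i = rng.integers(P)                # stochastic GD on one example
        h = J @ X[i] / np.sqrt(N)
        delta = g(h).sum() - y[i]          # output error on this example
        J -= eta * delta * np.outer(dg(h), X[i]) / np.sqrt(N)
    return J @ B.T / N                     # student-teacher overlap matrix

# Specialization shows up as each row of the overlap matrix developing one
# dominant entry; on the plateau all entries stay roughly equal.
print(sgd_from_fixed_set().round(2))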
Finite size scaling in neural networks
We demonstrate that the fraction of pattern sets that can be stored in
single- and hidden-layer perceptrons exhibits finite size scaling. This
feature allows us to estimate the critical storage capacity α_c from
simulations of relatively small systems. We illustrate this approach by
determining α_c, together with the finite size scaling exponent ν, for storing
Gaussian patterns in committee and parity machines with binary couplings and
up to K = 5 hidden units.
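The finite-size-scaling protocol itself is easy to sketch. The paper's committee and parity machines with binary couplings need an expensive combinatorial storability check, so the sketch below swaps in a continuous-coupling single-layer perceptron, where storability reduces to a linear feasibility problem; the protocol of measuring the storable fraction at several N and collapsing the curves is the same:

import numpy as np
from scipy.optimize import linprog

def storable(X, y):
    # A continuous-coupling perceptron can store (X, y) iff some w satisfies
    # y_i (w . x_i) >= 1 for all i -- a linear feasibility problem.
    A_ub = -y[:, None] * X
    b_ub = -np.ones(len(y))
    res = linprog(np.zeros(X.shape[1]), A_ub=A_ub, b_ub=b_ub,
                  bounds=[(None, None)] * X.shape[1], method="highs")
    return res.status == 0

def storage_fraction(N, alpha, trials=40, seed=0):
    rng = np.random.default_rng(seed)
    P = max(1, int(alpha * N))
    return np.mean([storable(rng.standard_normal((P, N)),
                             rng.choice([-1.0, 1.0], P))
                    for _ in range(trials)])

# Plotted against (alpha - alpha_c) * N**(1/nu), the curves for different N
# should collapse; here they visibly steepen with N around the perceptron's
# known capacity alpha_c = 2.
for N in (20, 40, 80):
    print(N, [round(storage_fraction(N, a), 2) for a in (1.5, 2.0, 2.5)])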
Learning in ultrametric committee machines
The problem of learning from examples in ultrametric committee machines (UCMs) is studied within the framework of statistical mechanics. Using the replica formalism, we calculate the average generalization error in UCMs with L hidden layers and a large enough number of units. In most of the regimes studied we find that the generalization error, as a function of the number of examples presented, develops a discontinuous drop at a critical value of the load parameter. We also find that, when L > 1, teacher networks with the same number of hidden layers but different overlaps induce learning processes with the same critical points.
Correlation of internal representations in feed-forward neural networks
Feed-forward multilayer neural networks implementing random input-output
mappings develop characteristic correlations between the activity of their
hidden nodes which are important for the understanding of the storage and
generalization performance of the network. It is shown how these correlations
can be calculated from the joint probability distribution of the aligning
fields at the hidden units, for an arbitrary decoder function between the
hidden layer and output. Explicit results are given for the parity, AND, and
committee machines with an arbitrary number of hidden nodes near saturation.
Functional Optimisation of Online Algorithms in Multilayer Neural Networks
We study the online dynamics of learning in fully connected soft committee
machines in the student-teacher scenario. The locally optimal modulation
function, which determines the learning algorithm, is obtained from a
variational argument in such a manner as to maximise the average generalisation
error decay per example. Simulation results for the resulting algorithm are
presented for a few cases. The symmetric phase plateaux are found to be vastly
reduced in comparison with those found when online backpropagation algorithms
are used. A discussion of the implementation of these ideas as practical
algorithms is given.
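The setting can be sketched as follows: a generic online rule updates each student weight vector by an amount set by a modulation function of the student fields and the teacher output. Online backpropagation is one particular choice, shown below; the paper instead derives the variationally optimal choice, which is not reproduced here. Parameters and the erf activation are assumptions:

import numpy as np
from scipy.special import erf

g = lambda h: erf(h / np.sqrt(2))
dg = lambda h: np.sqrt(2 / np.pi) * np.exp(-h * h / 2)

def online_learning(modulation, N=100, K=2, eta=0.2, steps=200_000, seed=2):
    rng = np.random.default_rng(seed)
    B = rng.standard_normal((K, N))          # teacher
    J = 1e-3 * rng.standard_normal((K, N))   # nearly symmetric student start
    for t in range(steps + 1):
        x = rng.standard_normal(N)           # online: a fresh example each step
        h = J @ x / np.sqrt(N)               # student fields
        sigma = g(B @ x / np.sqrt(N)).sum()  # teacher output
        J += (eta / np.sqrt(N)) * np.outer(modulation(h, sigma), x)
        if t % 40_000 == 0:                  # Monte Carlo generalization error
            Xt = rng.standard_normal((2000, N))
            eg = 0.5 * np.mean((g(Xt @ J.T / np.sqrt(N)).sum(1)
                                - g(Xt @ B.T / np.sqrt(N)).sum(1)) ** 2)
            print(t, round(eg, 4))           # the symmetric plateau shows up
                                             # as a long stretch of constant eg

# Backpropagation as one instance of a modulation function.
backprop = lambda h, sigma: (sigma - g(h).sum()) * dg(h)
online_learning(backprop)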
Storage capacity of ultrametric committee machines
The problem of computing the storage capacity of a feed-forward network with L hidden layers, N inputs, and K units in the first hidden layer is analyzed using techniques from statistical mechanics. We find that the storage capacity depends strongly on the network architecture, α_c ∝ (log K)^{1 - 1/2L}, and that the number of units K limits the number of possible hidden layers L through the relationship 2^{L-1} < 2 log K.
Computational capabilities of multilayer committee machines
We obtain an analytical expression for the computational complexity of multilayered committee machines with a finite number of hidden layers (L < 8), using the generalization complexity measure introduced by Franco et al. (2006), IEEE Trans. Neural Netw. 17, 578. Although our result is valid in the large-size limit and for an overlap synaptic matrix that is ultrametric, it provides a useful tool for inferring the architecture a network must have in order to reproduce an arbitrary realizable Boolean function.
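Franco's generalization complexity combines output-sensitivity terms at several Hamming distances; as a rough illustration of the first-order ingredient only (the normalization and higher-order terms of the actual measure are not reproduced here), one can measure how often a Boolean function changes its output across Hamming-distance-1 neighbours:

import numpy as np
from itertools import product

def first_order_complexity(f, n):
    # Fraction of (input, Hamming-distance-1 neighbour) pairs on which the
    # Boolean function f changes its output value.
    flips = 0
    for bits in product((-1, 1), repeat=n):
        x = np.array(bits)
        fx = f(x)
        for j in range(n):
            x[j] = -x[j]
            flips += f(x) != fx
            x[j] = -x[j]
    return flips / (2 ** n * n)

parity = lambda x: int(np.prod(x))
majority = lambda x: int(np.sign(x.sum()))   # n odd, so no ties
print(first_order_complexity(parity, 5))     # 1.0: maximally sensitive
print(first_order_complexity(majority, 5))   # smaller: a "simpler" function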
Weight Space Structure and Internal Representations: a Direct Approach to Learning and Generalization in Multilayer Neural Networks
We analytically derive the geometrical structure of the weight space in
multilayer neural networks (MLN), in terms of the volumes of couplings
associated with the internal representations of the training set. Focusing on the
parity and committee machines, we deduce their learning and generalization
capabilities both reinterpreting some known properties and finding new exact
results. The relationship between our approach and information theory as well
as the Mitchison-Durbin calculation is established. Our results are exact in
the limit of a large number of hidden units, showing that MLN are a class of
exactly solvable models with a simple interpretation of replica symmetry
breaking.