344 research outputs found
Dynamics of Learning with Restricted Training Sets I: General Theory
We study the dynamics of supervised learning in layered neural networks, in
the regime where the size of the training set is proportional to the number
of inputs. Here the local fields are no longer described by Gaussian
probability distributions and the learning dynamics is of a spin-glass nature,
with the composition of the training set playing the role of quenched disorder.
We show how dynamical replica theory can be used to predict the evolution of
macroscopic observables, including the two relevant performance measures
(training error and generalization error), incorporating the old formalism
developed for complete training sets as a special case, recovered in the limit
where the training set is infinitely large relative to the number of inputs.
For simplicity we restrict ourselves in this paper to
single-layer networks and realizable tasks.
Comment: 39 pages, LaTeX
Cluster derivation of Parisi's RSB solution for disordered systems
We propose a general scheme in which disordered systems are allowed to
sacrifice energy equi-partitioning and separate into a hierarchy of ergodic
sub-systems (clusters) with different characteristic time-scales and
temperatures. The details of the break-up follow from the requirement of
stationarity of the entropy of the slower cluster, at every level in the
hierarchy. We apply our ideas to the Sherrington-Kirkpatrick model, and show
how the Parisi solution can be derived quantitatively from plausible
physical principles. Our approach gives new insight into the physics behind
Parisi's solution and its relations with other theories, numerical experiments,
and short-range models.
Comment: 7 pages, 5 figures
Dynamical Solution of the On-Line Minority Game
We solve the dynamics of the on-line minority game, with general types of
decision noise, using generating functional techniques à la De Dominicis and
the temporal regularization procedure of Bedeaux et al. The result is a
macroscopic dynamical theory in the form of closed equations for correlation-
and response functions defined via an effective continuous-time single-trader
process, which are exact in both the ergodic and in the non-ergodic regime of
the minority game. Our solution also explains why, although one cannot formally
truncate the Kramers-Moyal expansion of the process after the Fokker-Planck
term, upon doing so one still finds the correct solution, that the previously
proposed diffusion matrices for the Fokker-Planck term are incomplete, and how
previously proposed approximations of the market volatility can be traced back
to ergodicity assumptions.
Comment: 25 pages LaTeX, no figures
Feed-Forward Chains of Recurrent Attractor Neural Networks Near Saturation
We perform a stationary state replica analysis for a layered network of Ising
spin neurons, with recurrent Hebbian interactions within each layer, in
combination with strictly feed-forward Hebbian interactions between successive
layers. This model interpolates between the fully recurrent and symmetric
attractor network studied by Amit et al., and the strictly feed-forward
attractor network studied by Domany et al. Due to the absence of detailed
balance, it is as yet solvable only in the zero temperature limit. The built-in
competition between two qualitatively different modes of operation,
feed-forward (ergodic within layers) versus recurrent (non-ergodic within
layers), is found to induce interesting phase transitions.
Comment: 14 pages LaTeX with 4 postscript figures, submitted to J. Phys.
Slowly evolving geometry in recurrent neural networks I: extreme dilution regime
We study extremely diluted spin models of neural networks in which the
connectivity evolves in time, although adiabatically slowly compared to the
neurons, according to stochastic equations which on average aim to reduce
frustration. The (fast) neurons and (slow) connectivity variables equilibrate
separately, but at different temperatures. Our model is exactly solvable in
equilibrium. We obtain phase diagrams upon making the condensed ansatz (i.e.
recall of one pattern). These show that, as the connectivity temperature is
lowered, the volume of the retrieval phase diverges and the fraction of
mis-aligned spins is reduced. Still one always retains a region in the
retrieval phase where recall states other than the one corresponding to the
`condensed' pattern are locally stable, so the associative memory character of
our model is preserved.
Comment: 18 pages, 6 figures
Random replicators with asymmetric couplings
Systems of interacting random replicators are studied using generating
functional techniques. While replica analyses of such models are limited to
systems with symmetric couplings, dynamical approaches as presented here allow
specifically to address cases with asymmetric interactions where there is no
Lyapunov function governing the dynamics. We here focus on replicator models
with Gaussian couplings of general symmetry between p>=2 species, and discuss
how an effective description of the dynamics can be derived in terms of a
single-species process. Upon making a fixed-point ansatz, persistent order
parameters in the ergodic stationary states can be extracted from this process,
and different types of phase transitions can be identified and related to each
other. We discuss the effects of asymmetry in the couplings on the order
parameters and the phase behaviour for p=2 and p=3. Numerical simulations
verify our theory. For the case of cubic interactions numerical experiments
indicate regimes in which only a finite number of species survives, even when
the thermodynamic limit is considered.
Comment: revised version, removed some mathematical parts, discussion of
negatively correlated couplings added, figures added
Generating Functional Analysis of the Dynamics of the Batch Minority Game with Random External Information
We study the dynamics of the batch minority game, with random external
information, using generating functional techniques à la De Dominicis. The
relevant control parameter in this model is the ratio $\alpha=p/N$ of the
number $p$ of possible values for the external information over the number $N$
of trading agents. In the limit $N\to\infty$ we calculate the location $\alpha_c$
of the phase transition (signaling the onset of anomalous response),
and solve the statics for $\alpha>\alpha_c$ exactly. The temporal correlations
in global market fluctuations turn out not to decay to zero for infinitely
widely separated times. For $\alpha<\alpha_c$ the stationary state is shown to
be non-unique. For $\alpha\to 0$ we analyse our equations in leading order in
$\alpha$, and find asymptotic solutions with diverging volatility
$\sigma=\mathcal{O}(\alpha^{-1/2})$ (as regularly observed in simulations), but
also asymptotic solutions with vanishing volatility
$\sigma=\mathcal{O}(\alpha^{1/2})$. The former, however, are shown to emerge only
if the agents' initial strategy valuations are below a specific critical value.
Comment: 15 pages, 6 figures, uses Revtex. Replaced an old version of
volatility graph that. Rephrased and updated some references
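As a rough illustration of the model this abstract describes, the following is a minimal simulation sketch of a batch minority game with two random strategies per agent. It is not the authors' code: the function name `simulate_batch_mg`, the update normalisation `2/sqrt(N)`, the tie-breaking rule at zero valuation, and the use of the per-step mean squared bid as a volatility proxy are all assumptions made for this example. The control parameter is taken to be alpha = p/N, the number of information values divided by the number of agents.

```python
import math
import random


def simulate_batch_mg(n_agents, n_info, n_steps, q0=0.0, seed=0):
    """Toy batch minority game (assumed conventions, see lead-in).

    Returns (alpha, sigma2), where alpha = n_info / n_agents and
    sigma2[t] is the mean over information values of the squared
    rescaled total bid at step t.
    """
    rng = random.Random(seed)
    sqrt_n = math.sqrt(n_agents)

    # Each agent holds two random +/-1 strategy tables over n_info values.
    # xi is half their difference, omega half their sum.
    xi, omega = [], []
    for _ in range(n_agents):
        r1 = [rng.choice((-1, 1)) for _ in range(n_info)]
        r2 = [rng.choice((-1, 1)) for _ in range(n_info)]
        xi.append([(a - b) / 2 for a, b in zip(r1, r2)])
        omega.append([(a + b) / 2 for a, b in zip(r1, r2)])

    q = [q0] * n_agents  # initial strategy valuation differences
    sigma2 = []
    for _ in range(n_steps):
        # Agent i plays strategy 1 if q[i] >= 0, else strategy 2.
        s = [1 if qi >= 0 else -1 for qi in q]
        # Rescaled total bid for every information value mu.
        bids = [
            sum(omega[i][mu] + xi[i][mu] * s[i] for i in range(n_agents)) / sqrt_n
            for mu in range(n_info)
        ]
        sigma2.append(sum(a * a for a in bids) / n_info)
        # Batch update: average over all information values at once,
        # rewarding the strategy that would have been in the minority.
        for i in range(n_agents):
            q[i] -= (2 / sqrt_n) * sum(xi[i][mu] * bids[mu] for mu in range(n_info))
    return n_info / n_agents, sigma2
```

Calling `simulate_batch_mg(101, 20, 50, q0=0.0)` and again with a larger `q0` gives a crude way to explore how the initial valuations influence the volatility proxy at small alpha; at these sizes the output is only indicative.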
Noise, regularizers, and unrealizable scenarios in online learning from restricted training sets
We study the dynamics of on-line learning in multilayer neural networks where training examples are sampled with repetition and where the number of examples scales with the number of network weights. The analysis is carried out using the dynamical replica method aimed at obtaining a closed set of coupled equations for a set of macroscopic variables from which both training and generalization errors can be calculated. We focus on scenarios whereby training examples are corrupted by additive Gaussian output noise and regularizers are introduced to improve the network performance. The dependence of the dynamics on the noise level, with and without regularizers, is examined, as well as that of the asymptotic values obtained for both training and generalization errors. We also demonstrate the ability of the method to approximate the learning dynamics in structurally unrealizable scenarios. The theoretical results show good agreement with those obtained by computer simulations.
The signal-to-noise analysis of the Little-Hopfield model revisited
Using the generating functional analysis, an exact recursion relation is
derived for the time evolution of the effective local field of the fully
connected Little-Hopfield model. It is shown that, by leaving out the feedback
correlations arising from earlier times in this effective dynamics, one
precisely finds the recursion relations usually employed in the signal-to-noise
approach. The consequences of this approximation as well as the physics behind
it are discussed. In particular, it is pointed out why it is hard to notice the
effects, especially for model parameters corresponding to retrieval. Numerical
simulations confirm these findings. The signal-to-noise analysis is then
extended to include all correlations, making it a full theory for dynamics at
the level of the generating functional analysis. The results are applied to the
frequently employed extremely diluted (a)symmetric architectures and to
sequence processing networks.
Comment: 26 pages, 3 figures
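For orientation, the kind of dynamics analysed in this abstract can be sketched as a zero-temperature, parallel-update Little-Hopfield network with Hebbian couplings. The code below is an illustrative toy and not the paper's method: the function name `little_hopfield_overlaps`, the default sizes, and the choice to corrupt the first pattern by flipping a fixed fraction of spins are assumptions. The local field is written as a signal term (overlap with the recalled pattern) plus crosstalk from the other patterns, minus the self-interaction.

```python
import random


def little_hopfield_overlaps(n=200, p=2, flip_fraction=0.1, n_steps=3, seed=0):
    """Overlap with pattern 0 under parallel zero-temperature dynamics.

    Couplings are Hebbian, J_ij = (1/n) * sum_mu xi_i^mu xi_j^mu with
    J_ii = 0, evaluated through the pattern overlaps instead of an
    explicit n-by-n matrix.
    """
    rng = random.Random(seed)
    patterns = [[rng.choice((-1, 1)) for _ in range(n)] for _ in range(p)]

    # Start from pattern 0 with a fixed fraction of spins flipped.
    state = list(patterns[0])
    for i in rng.sample(range(n), int(flip_fraction * n)):
        state[i] = -state[i]

    def overlap(mu):
        return sum(x * s for x, s in zip(patterns[mu], state)) / n

    history = [overlap(0)]
    for _ in range(n_steps):
        m = [overlap(mu) for mu in range(p)]
        new_state = []
        for i in range(n):
            # Signal plus crosstalk, minus the removed diagonal term.
            h = sum(patterns[mu][i] * m[mu] for mu in range(p)) - (p / n) * state[i]
            new_state.append(1 if h >= 0 else -1)
        state = new_state  # all spins are updated simultaneously
        history.append(overlap(0))
    return history
```

At this low loading the crosstalk is small compared with the signal, so the overlap is expected to grow towards 1 within a step or two; increasing `p` at fixed `n` makes the noise term, and hence the feedback correlations discussed in the abstract, more relevant.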