Optimisation of on-line principal component analysis
Different techniques, used to optimise on-line principal component analysis,
are investigated by methods of statistical mechanics. These include local and
global optimisation of node-dependent learning rates, which are shown to be very
efficient in speeding up the learning process. They are investigated further
for gaining insight into the learning rates' time-dependence, which is then
employed for devising simple practical methods to improve training performance.
Simulations demonstrate the benefit gained from using the new methods. Comment: 10 pages, 5 figures
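The setting optimised above can be illustrated with a generic Oja/Sanger-type online PCA update carrying one learning rate per node. This is a minimal sketch of the setup only, not the paper's optimised scheme; the rate values are placeholders:

```python
import numpy as np

def online_pca_step(W, x, eta):
    """One Oja/Sanger-type online PCA update with node-dependent learning rates.

    W   : (k, d) current estimates of the top-k principal directions
    x   : (d,)   a single data sample
    eta : (k,)   per-node learning rates (the quantity optimised in the paper)
    """
    y = W @ x  # projections onto the current directions
    # Sanger (generalized Hebbian) update: move towards x, subtracting
    # components already explained by earlier nodes
    W += eta[:, None] * (np.outer(y, x) - np.tril(np.outer(y, y)) @ W)
    # renormalise rows for numerical stability
    W /= np.linalg.norm(W, axis=1, keepdims=True)
    return W
```

Run on data with one dominant direction, the first row of W aligns with the leading eigenvector; node-dependent (and time-dependent) choices of `eta` are what speed up this convergence.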
Towards Designing Artificial Universes for Artificial Agents under Interaction Closure
We are interested in designing artificial universes for artificial agents. We view artificial agents as networks of high-level processes on top of a low-level detailed-description system. We require that the high-level processes have some intrinsic explanatory power, and we introduce an extension of informational closure, namely interaction closure, to capture this. We then derive a method to design artificial universes, in the form of finite Markov chains, which exhibit high-level processes that satisfy the property of interaction closure. We also investigate control, or information transfer, which we see as a building block for networks representing artificial agents.
An empirical study of scanner system parameters
The selection of the current combination of parametric values (instantaneous field of view, number and location of spectral bands, signal-to-noise ratio, etc.) of a multispectral scanner is a complex problem due to the strong interrelationship these parameters have with one another. The study was done with the proposed scanner known as the Thematic Mapper in mind. Since an adequate theoretical procedure for this problem has apparently not yet been devised, an empirical simulation approach was used, with candidate parameter values selected by heuristic means. The results obtained using a conventional maximum-likelihood pixel classifier suggest that although the classification accuracy declines slightly as the IFOV is decreased, this is more than made up for by an improved mensuration accuracy. Further, the use of a classifier involving both spatial and spectral features shows a very substantial tendency to resist degradation as the signal-to-noise ratio is decreased. Finally, further evidence is provided of the importance of having at least one spectral band in each of the major available portions of the optical spectrum.
Functional Optimisation of Online Algorithms in Multilayer Neural Networks
We study the online dynamics of learning in fully connected soft committee
machines in the student-teacher scenario. The locally optimal modulation
function, which determines the learning algorithm, is obtained from a
variational argument in such a manner as to maximise the average generalisation
error decay per example. Simulation results for the resulting algorithm are
presented for a few cases. The symmetric phase plateaux are found to be vastly
reduced in comparison to those found when online backpropagation algorithms are
used. A discussion of the implementation of these ideas as practical algorithms
is given.
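The student-teacher scenario described above can be sketched with plain online gradient descent in a soft committee machine; the paper's variationally optimal modulation function would replace the `err * g'` factor in the update below. All names here are illustrative, not from the paper:

```python
import math
import numpy as np

# soft-committee activation g(u) = erf(u / sqrt(2))
g = np.vectorize(lambda u: math.erf(u / math.sqrt(2)))

def committee(W, x):
    """Soft committee machine: sum of K sigmoidal hidden units, fixed output weights."""
    return g(W @ x / math.sqrt(W.shape[1])).sum()

def backprop_step(W, x, y_teacher, eta):
    """One online gradient-descent step on the squared error; the locally
    optimal algorithm would substitute its modulation function for the
    (y_teacher - y_student) * g'(h) factor."""
    N = W.shape[1]
    h = W @ x / math.sqrt(N)                              # hidden pre-activations
    err = y_teacher - g(h).sum()                          # output error
    gprime = math.sqrt(2 / math.pi) * np.exp(-h**2 / 2)   # derivative of g
    W += (eta / math.sqrt(N)) * err * np.outer(gprime, x)
    return W
```

Training a student against a matching teacher committee drives the generalisation error down; the symmetric-phase plateau mentioned above appears as a long stretch where all student vectors stay similar before specialising.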
Apparent actions and apparent goal-directedness
Daniel Polani, Martin Biehl, ‘Apparent actions and apparent goal-directedness’, paper presented at the 13th European Conference on Artificial Life (ECAL 2015), York, UK, 20-24 July 2015. In human history countless phenomena have been (wrongly) attributed to agents. For instance, science now holds that there are no gods (agents) of lightning, thunder and wind behind the associated phenomena. In physics (assuming quantum decoherence) the universe is modelled as a state space with a dynamical law that determines everything that happens within it. This, however, is incompatible with most notions of agency (cf. Barandiaran et al., 2009), which require actions: for an agent candidate to have actions it must be able to “make something happen” as opposed to only “have things happen to it”. Here we ask which single sequences of partial observations may appear to contain agency to a passive observer who has its own memory. For this we define measures of apparent actions and apparent goal-directedness. Goal-directedness is another feature commonly attributed to agents. We here ignore whatever causes the appearances, as well as the concept of individuality of agents.
Phase transitions in soft-committee machines
Equilibrium statistical physics is applied to layered neural networks with
differentiable activation functions. A first analysis of off-line learning in
soft-committee machines with a finite number (K) of hidden units learning a
perfectly matching rule is performed. Our results are exact in the limit of
high training temperatures. For K=2 we find a second order phase transition
from unspecialized to specialized student configurations at a critical size P
of the training set, whereas for K > 2 the transition is first order. Monte
Carlo simulations indicate that our results remain qualitatively valid at
moderately low temperatures. The limit K to infinity can be performed
analytically; the transition occurs after presenting on the order of N K
examples. However, an unspecialized metastable state persists up to
P = O(N K^2). Comment: 8 pages, 4 figures
Modeling one-dimensional island growth with mass-dependent detachment rates
We study one-dimensional models of particle diffusion and
attachment/detachment from islands where the detachment rates gamma(m) of
particles at the cluster edges increase with cluster mass m. They are expected
to mimic the effects of lattice mismatch with the substrate and/or long-range
repulsive interactions that work against the formation of long islands.
Short-range attraction is represented by an overall factor epsilon<<1 in the
detachment rates relative to isolated-particle hopping rates [epsilon ~
exp(-E/T), with binding energy E and temperature T]. We consider various
gamma(m), from rapidly increasing forms such as gamma(m) ~ m to slowly
increasing ones, such as gamma(m) ~ [m/(m+1)]^b. A mapping onto a column
problem shows that these systems are zero-range processes, whose steady-state
properties are exactly calculated under the assumption of independent column
heights in the Master equation. Simulation provides island size distributions
which confirm analytic reductions and are useful whenever the analytical tools
cannot provide results in closed form. The shape of island size distributions
can be changed from monomodal to monotonically decreasing by tuning the
temperature or changing the particle density rho. Small values of the scaling
variable X=epsilon^{-1}rho/(1-rho) favour the monotonically decreasing ones.
However, for large X, rapidly increasing gamma(m) lead to distributions with
peaks very close to the mean island size <m> and rapidly decreasing tails,
while slowly increasing gamma(m) provide peaks close to <m>/2 and fat right
tails. Comment: 16 pages, 6 figures
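The column mapping described above can be illustrated with a toy zero-range process simulation. This is a hedged sketch, not the paper's model: gamma(m) = m/(m+1) is one of the slowly increasing forms named in the abstract, and the rejection scheme assumes gamma(m) <= 1:

```python
import numpy as np

def zrp_steady_state(L, M, gamma, steps, seed=0):
    """Toy zero-range process: M particles on a ring of L sites; the top
    particle of a column of height m hops to a random neighbour at rate
    gamma(m) (implemented by rejection, so gamma must stay <= 1).
    Returns the column-height (island-size) histogram after `steps` attempts."""
    rng = np.random.default_rng(seed)
    h = np.zeros(L, dtype=int)
    for p in rng.integers(0, L, size=M):      # random initial placement
        h[p] += 1
    for _ in range(steps):
        i = rng.integers(0, L)
        if h[i] == 0 or rng.random() >= gamma(h[i]):
            continue                          # rejection step enforces the rate
        h[i] -= 1
        h[(i + rng.choice((-1, 1))) % L] += 1 # hop left or right
    return np.bincount(h)
```

The particle density rho = M/L and the form of gamma(m) control whether the resulting size histogram is monotonically decreasing or peaked, mirroring the transition discussed above.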
Analysis of dropout learning regarded as ensemble learning
Deep learning is the state-of-the-art in fields such as visual object
recognition and speech recognition. Deep networks use many layers, a huge
number of units, and many connections; overfitting is therefore a serious
problem. To avoid this problem, dropout learning has been proposed. Dropout
learning neglects some inputs and hidden units during learning with
probability p, and the neglected inputs and hidden units are then combined
with the learned network to express the final output. We find that the process
of combining the neglected hidden units with the learned network can be
regarded as ensemble learning, so we analyze dropout learning from this point
of view. Comment: 9 pages, 8 figures, submitted to Conferenc
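The ensemble view mentioned above can be made concrete for a single linear layer, where averaging the output over all dropout masks reproduces the usual test-time scaling of the inputs by p. This is a minimal sketch with hypothetical names, not the paper's analysis:

```python
import numpy as np

def dropout_forward(x, W, p, rng):
    """Training pass: each input unit is kept independently with probability p."""
    mask = rng.random(x.shape) < p
    return W @ (mask * x)

def ensemble_predict(x, W, p):
    """Test-time rule: for a linear layer, scaling inputs by p equals the
    expectation over all 2^n dropout masks -- the ensemble-learning view."""
    return W @ (p * x)
```

For nonlinear networks the equality is only approximate, which is exactly why the ensemble interpretation of the combining step is worth analysing.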
Dynamics of Learning with Restricted Training Sets I: General Theory
We study the dynamics of supervised learning in layered neural networks, in
the regime where the size of the training set is proportional to the number
of inputs. Here the local fields are no longer described by Gaussian
probability distributions and the learning dynamics is of a spin-glass nature,
with the composition of the training set playing the role of quenched disorder.
We show how dynamical replica theory can be used to predict the evolution of
macroscopic observables, including the two relevant performance measures
(training error and generalization error), incorporating the old formalism
developed for complete training sets in the limit alpha to infinity as a
special case. For simplicity we restrict ourselves in this paper to
single-layer networks and realizable tasks. Comment: 39 pages, LaTeX
Finite-size effects in on-line learning of multilayer neural networks
We complement recent advances in thermodynamic-limit analyses of mean on-line gradient descent learning dynamics in multi-layer networks by calculating the fluctuations exhibited by finite-dimensional systems. Fluctuations from the mean dynamics are largest at the onset of specialisation, as student hidden-unit weight vectors begin to imitate specific teacher vectors, and increase with the degree of symmetry of the initial conditions. In light of this, we include a term to stimulate asymmetry in the learning process, which typically also leads to a significant decrease in training time.