Neural Network Methods for Boundary Value Problems Defined in Arbitrarily Shaped Domains
Partial differential equations (PDEs) with Dirichlet boundary conditions
defined on boundaries with simple geometry have been successfully treated
using sigmoidal multilayer perceptrons in previous works. This article deals
with the case of complex boundary geometry, where the boundary is determined
by a set of closely spaced points belonging to it, so as to offer a reasonable
representation. Two networks are employed: a multilayer perceptron and a
radial basis function network. The latter is used to account for the
satisfaction of the boundary conditions. The method has been successfully
tested on two-dimensional and three-dimensional PDEs and has yielded accurate
solutions.
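As a rough illustration of this division of labour (a multilayer perceptron fitted to the PDE in the interior, with a radial basis function layer centred on the given boundary points absorbing the Dirichlet data), consider the following sketch. The chosen problem, boundary shape, RBF width, finite-difference residual and numerical gradients are all illustrative assumptions; this is not the training scheme of the article itself.

import numpy as np

rng = np.random.default_rng(0)

# Illustrative problem: Laplace equation u_xx + u_yy = 0 on a star-shaped
# domain, with Dirichlet data g(x, y) = x^2 - y^2 known only at a cloud of
# closely spaced boundary points (assumed setup, not taken from the article).
g = lambda x, y: x**2 - y**2

theta = np.linspace(0.0, 2.0*np.pi, 80, endpoint=False)
rb = 1.0 + 0.2*np.cos(3.0*theta)                   # arbitrarily shaped boundary
xb, yb = rb*np.cos(theta), rb*np.sin(theta)        # boundary point cloud

pts = rng.uniform(-1.3, 1.3, size=(1500, 2))       # interior collocation points
rad, ang = np.hypot(pts[:, 0], pts[:, 1]), np.arctan2(pts[:, 1], pts[:, 0])
keep = rad < 0.95*(1.0 + 0.2*np.cos(3.0*ang))
xi, yi = pts[keep, 0], pts[keep, 1]

# Multilayer perceptron part of the trial solution.
params = [rng.normal(0, 0.5, (2, 10)), np.zeros(10),
          rng.normal(0, 0.5, (10, 1)), np.zeros(1)]

def mlp(x, y, p):
    h = np.tanh(np.column_stack([x, y]) @ p[0] + p[1])
    return (h @ p[2] + p[3]).ravel()

def rbf(x, y, c, width=0.3):
    d2 = (x[:, None] - xb)**2 + (y[:, None] - yb)**2
    return np.exp(-d2/width**2) @ c

def fit_rbf(p, width=0.3):
    # Linear solve so that MLP + RBF reproduces the Dirichlet data exactly
    # at the given boundary points.
    d2 = (xb[:, None] - xb)**2 + (yb[:, None] - yb)**2
    K = np.exp(-d2/width**2) + 1e-8*np.eye(len(xb))
    return np.linalg.solve(K, g(xb, yb) - mlp(xb, yb, p))

def loss(p, c, h=1e-3):
    # Mean squared PDE residual (finite-difference Laplacian) at interior points.
    u = lambda a, b: mlp(a, b, p) + rbf(a, b, c)
    lap = (u(xi+h, yi) + u(xi-h, yi) + u(xi, yi+h) + u(xi, yi-h) - 4.0*u(xi, yi))/h**2
    return np.mean(lap**2)

# Crude training loop: numerical gradients for the MLP, boundary conditions
# re-imposed through the RBF coefficients at every step.
for step in range(50):
    c = fit_rbf(params)
    base, grads = loss(params, c), []
    for arr in params:
        grad = np.zeros_like(arr)
        for idx in np.ndindex(arr.shape):
            old = arr[idx]; arr[idx] = old + 1e-4
            grad[idx] = (loss(params, c) - base)/1e-4
            arr[idx] = old
        grads.append(grad)
    for arr, grad in zip(params, grads):
        arr -= 1e-3*grad
print("mean squared PDE residual:", loss(params, fit_rbf(params)))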
Ensemble learning of linear perceptron; Online learning theory
Within the framework of on-line learning, we study the generalization error
of an ensemble learning machine learning from a linear teacher perceptron. The
generalization error achieved by an ensemble of linear perceptrons having
homogeneous or inhomogeneous initial weight vectors is precisely calculated at
the thermodynamic limit of a large number of input elements and shows rich
behavior. Our main findings are as follows. For learning with homogeneous
initial weight vectors, the generalization error using an infinite number of
linear student perceptrons is equal to only half that of a single linear
perceptron, and for a finite number K of linear perceptrons it converges to the
infinite-ensemble value as O(1/K). For learning with inhomogeneous initial
weight vectors, it is advantageous to take a weighted average over the outputs
of the linear perceptrons, and we show the conditions under which the optimal
weights are constant during the learning process. The optimal weights depend
only on the correlation of the initial weight vectors.
Comment: 14 pages, 3 figures, submitted to Physical Review
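As a rough numerical companion to this setting, the sketch below trains K linear student perceptrons on-line from a common linear teacher and compares the generalization error of a typical single student with that of the simple ensemble average. The learning rate, the independent example stream per student and the plain LMS update are illustrative assumptions, not details taken from the abstract.

import numpy as np

rng = np.random.default_rng(1)
N, K, steps = 200, 10, 5000          # input dimension, ensemble size, examples
eta = 0.1 / N                        # small learning rate (assumed)

B = rng.normal(size=N)
B /= np.linalg.norm(B)                      # linear teacher perceptron
J = rng.normal(size=(K, N)) / np.sqrt(N)    # student weight vectors

for _ in range(steps):
    for k in range(K):                      # independent example streams
        x = rng.normal(size=N)
        err = B @ x - J[k] @ x              # teacher output minus student output
        J[k] += eta * err * x               # on-line LMS / gradient step

def gen_error(w):
    # Generalization error 0.5 * <(B.x - w.x)^2> over Gaussian inputs.
    d = B - w
    return 0.5 * d @ d

single = np.mean([gen_error(J[k]) for k in range(K)])   # typical single student
ensemble = gen_error(J.mean(axis=0))                    # simple ensemble average
print(f"single student: {single:.4f}   K={K} ensemble: {ensemble:.4f}")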
Neural Relax
We present an algorithm for data preprocessing of an associative memory,
inspired by an electrostatic problem that turns out to have intimate relations
with information maximization.
Storage capacity of a constructive learning algorithm
Upper and lower bounds for the typical storage capacity of a constructive
algorithm, the Tilinglike Learning Algorithm for the Parity Machine [M. Biehl
and M. Opper, Phys. Rev. A {\bf 44} 6888 (1991)], are determined in the
asymptotic limit of large training set sizes. The properties of a perceptron
with threshold, learning a training set of patterns having a biased
distribution of targets, needed as an intermediate step in the capacity
calculation, are determined analytically. The lower bound for the capacity,
determined with a cavity method, is proportional to the number of hidden units.
The upper bound, obtained under the hypothesis of replica symmetry, is close to
the one predicted by Mitchison and Durbin [Biol. Cybern. {\bf 60} 345 (1989)].
Comment: 13 pages, 1 figure
Theory of Interacting Neural Networks
In this contribution we give an overview over recent work on the theory of
interacting neural networks. The model is defined in Section 2. The typical
teacher/student scenario is considered in Section 3. A static teacher network
presents training examples to an adaptive student network. In the case of
multilayer networks, the student shows a transition from a symmetric state to
specialisation. Neural networks can also generate a time series. Training on
time series and predicting it are studied in Section 4. When a network is
trained on its own output, it interacts with itself. Such a scenario has
implications for the theory of prediction algorithms, as discussed in Section 5.
When a system of networks is trained on its minority decisions, it may be
considered as a model for competition in closed markets, see Section 6. In
Section 7 we consider two mutually interacting networks. A novel phenomenon is
observed: synchronisation by mutual learning. In Section 8 it is shown how
this phenomenon can be applied to cryptography: generation of a secret key over
a public channel.
Comment: Contribution to Networks, ed. by H.G. Schuster and S. Bornholdt, to
be published by Wiley VCH
Interacting neural networks and cryptography
Two neural networks which are trained on their mutual output bits are
analysed using methods of statistical physics. The exact solution of the
dynamics of the two weight vectors shows a novel phenomenon: The networks
synchronize to a state with identical time dependent weights. Extending the
models to multilayer networks with discrete weights, it is shown how
synchronization by mutual learning can be applied to secret key exchange over a
public channel.
Comment: Invited talk for the meeting of the German Physical Society
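Mutual-learning synchronization of this kind is commonly realized with tree parity machines; the sketch below shows two such machines with discrete weights trained on each other's public output bits until their weights coincide, at which point the weights can serve as a shared secret key. The parameter choices (K = 3 hidden units, N = 10 inputs per unit, weight bound L = 3) and the Hebbian update rule are illustrative assumptions rather than details fixed by the abstract.

import numpy as np

rng = np.random.default_rng(2)
K, N, L = 3, 10, 3                   # hidden units, inputs per unit, weight bound

def output(w, x):
    # Hidden-unit signs and overall parity output of one machine.
    sigma = np.sign(np.sum(w * x, axis=1))
    sigma[sigma == 0] = -1           # break ties deterministically
    return sigma, int(np.prod(sigma))

wA = rng.integers(-L, L + 1, size=(K, N))    # party A's secret weights
wB = rng.integers(-L, L + 1, size=(K, N))    # party B's secret weights

steps = 0
while not np.array_equal(wA, wB) and steps < 200000:
    steps += 1
    x = rng.choice([-1, 1], size=(K, N))     # public random input
    sA, tauA = output(wA, x)
    sB, tauB = output(wB, x)
    if tauA == tauB:                         # only the output bits are exchanged
        for w, s, tau in ((wA, sA, tauA), (wB, sB, tauB)):
            for k in range(K):
                if s[k] == tau:              # only agreeing hidden units learn
                    w[k] = np.clip(w[k] + tau * x[k], -L, L)

print(f"synchronized after {steps} steps" if np.array_equal(wA, wB)
      else "did not synchronize within the step limit")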
How deep is deep enough? -- Quantifying class separability in the hidden layers of deep neural networks
Deep neural networks typically outperform more traditional machine learning
models in their ability to classify complex data, and yet it is not clear how the
individual hidden layers of a deep network contribute to the overall
classification performance. We thus introduce a Generalized Discrimination
Value (GDV) that measures, in a non-invasive manner, how well different data
classes separate in each given network layer. The GDV can be used for the
automatic tuning of hyper-parameters, such as the width profile and the total
depth of a network. Moreover, the layer-dependent GDV(L) provides new insights
into the data transformations that self-organize during training: In the case
of multi-layer perceptrons trained with error backpropagation, we find that
classification of highly complex data sets requires a temporal {\em reduction}
of class separability, marked by a characteristic 'energy barrier' in the
initial part of the GDV(L) curve. Even more surprisingly, for a given data set,
the GDV(L) runs through a fixed 'master curve', independently of the total
number of network layers. Furthermore, applying the GDV to Deep Belief
Networks reveals that unsupervised training with the Contrastive Divergence
method can also systematically increase class separability over tens of
layers, even though the system does not 'know' the desired class labels. These
results indicate that the GDV may become a useful tool to open the black box of
deep learning.
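A layer-wise separability probe in the spirit of the GDV can be sketched as follows: z-score the activations of a given layer and compare the mean within-class distance with the mean between-class distance. The exact normalization used in the article may differ, and the "activations" below are synthetic stand-ins for the hidden-layer outputs of a real network.

import numpy as np

def separability(acts, labels):
    # GDV-style score: mean within-class distance minus mean between-class
    # distance on z-scored activations; more negative = better separated.
    acts = np.asarray(acts, dtype=float)
    acts = 0.5 * (acts - acts.mean(axis=0)) / (acts.std(axis=0) + 1e-12)
    labels = np.asarray(labels)
    classes = np.unique(labels)

    def mean_dist(a, b=None):
        if b is None:                        # within-class: skip self-pairs
            d = np.linalg.norm(a[:, None, :] - a[None, :, :], axis=-1)
            return d[np.triu_indices(len(a), k=1)].mean()
        return np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1).mean()

    intra = np.mean([mean_dist(acts[labels == c]) for c in classes])
    inter = np.mean([mean_dist(acts[labels == c1], acts[labels == c2])
                     for i, c1 in enumerate(classes) for c2 in classes[i + 1:]])
    return (intra - inter) / np.sqrt(acts.shape[1])

# Toy usage with synthetic "hidden layer activations" of a hypothetical network:
# the deeper layer separates the two classes more strongly.
rng = np.random.default_rng(3)
labels = np.repeat([0, 1], 50)
layer1 = rng.normal(size=(100, 32)) + 0.2 * labels[:, None]   # weakly separated
layer2 = rng.normal(size=(100, 32)) + 2.0 * labels[:, None]   # strongly separated
print("layer 1 separability:", separability(layer1, labels))
print("layer 2 separability:", separability(layer2, labels))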