Theoretical Properties of Projection Based Multilayer Perceptrons with Functional Inputs
Many real world data are sampled functions. As shown by Functional Data
Analysis (FDA) methods, spectra, time series, images, gesture recognition data,
etc. can be processed more efficiently if their functional nature is taken into
account during the data analysis process. This is done by extending standard
data analysis methods so that they can apply to functional inputs. A general
way to achieve this goal is to compute projections of the functional data onto
a finite dimensional sub-space of the functional space. The coordinates of the
data on a basis of this sub-space provide standard vector representations of
the functions. The obtained vectors can be processed by any standard method. In
our previous work, this general approach has been used to define projection
based Multilayer Perceptrons (MLPs) with functional inputs. In this paper we
study important theoretical properties of the proposed model. We show in
particular that MLPs with functional inputs are universal approximators: they
can approximate to arbitrary accuracy any continuous mapping from a compact
sub-space of a functional space to R. Moreover, we provide a consistency result
that shows that any mapping from a functional space to R can be learned from
examples by a projection based MLP: the generalization mean square error of
the MLP decreases to the smallest possible mean square error on the data as
the number of examples goes to infinity.
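The projection step described in this abstract can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation: the Fourier basis, the basis dimension of 5, and the random untrained MLP weights are all assumptions made for the demo.

```python
import numpy as np

rng = np.random.default_rng(0)

def fourier_basis(t, n_basis):
    """Columns: 1, cos(2*pi*t), sin(2*pi*t), cos(4*pi*t), ..."""
    cols = [np.ones_like(t)]
    k = 1
    while len(cols) < n_basis:
        cols.append(np.cos(2 * np.pi * k * t))
        if len(cols) < n_basis:
            cols.append(np.sin(2 * np.pi * k * t))
        k += 1
    return np.stack(cols, axis=1)

# A sampled functional input: f(t) = 1 + 2*cos(2*pi*t), observed with noise.
t = np.linspace(0.0, 1.0, 200)
f_sampled = 1.0 + 2.0 * np.cos(2 * np.pi * t) + 0.01 * rng.normal(size=t.size)

# Projection: least-squares coordinates of the sampled function on the
# finite-dimensional sub-space spanned by the basis.
B = fourier_basis(t, 5)
coords, *_ = np.linalg.lstsq(B, f_sampled, rcond=None)

# The coordinate vector then feeds an ordinary MLP; random weights stand in
# for a trained network here.
W1, b1 = rng.normal(size=(5, 8)), np.zeros(8)
W2, b2 = rng.normal(size=8), 0.0
output = np.tanh(coords @ W1 + b1) @ W2 + b2
print(coords[:3].round(2))
```

The recovered coordinates (about 1.0 and 2.0 for the constant and cosine terms) are the standard vector representation that any downstream method can consume.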
Statistical learnability of nuclear masses
More than 80 years after the seminal work of Weizsäcker and the liquid drop
model of the atomic nucleus, the deviations of mass models from experiment
(of the order of MeV) remain orders of magnitude larger than the
experimental errors (of the order of keV). Predicting the mass of atomic
nuclei with precision is extremely challenging. This is due to the
non-trivial many-body interplay of
protons and neutrons in nuclei, and the complex nature of the nuclear strong
force. The statistical theory of learning will be used to provide bounds on
the prediction errors of models trained with a finite data set. These bounds
are validated with neural network calculations, and compared with
state-of-the-art mass models. It will therefore be argued that nuclear
structure models investigating ground-state properties explore a system at
the limit of the knowledgeable, as defined by the statistical theory of
learning.
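The flavor of such capacity-dependent bounds can be illustrated with a textbook Vapnik-style inequality. This is a generic illustrative form, not the paper's exact bound, and the sample size (roughly 2400 measured masses, an assumed AME-scale count) is likewise only an order of magnitude.

```python
import numpy as np

def vc_bound(train_err, n, h, delta=0.05):
    """Vapnik-style bound: with probability 1 - delta the true error exceeds
    the training error by at most eps. Illustrative textbook form only."""
    eps = np.sqrt((h * (np.log(2.0 * n / h) + 1.0) + np.log(4.0 / delta)) / n)
    return train_err + eps

# With on the order of 2400 measured masses (assumed count), the bound
# degrades quickly as the model's effective capacity h grows.
bounds = {h: round(vc_bound(0.0, 2400, h), 3) for h in (10, 100, 1000)}
print(bounds)
```

Even with zero training error, a high-capacity model trained on a few thousand nuclei carries a large guaranteed-risk penalty, which is the sense in which the data set sits at the limit of the learnable.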
Representation of Functional Data in Neural Networks
Functional Data Analysis (FDA) is an extension of traditional data analysis
to functional data, for example spectra, temporal series, spatio-temporal
images, gesture recognition data, etc. In practice, functional data are
rarely observed directly; usually only a regular or irregular sampling is
available. For this reason,
some processing is needed in order to benefit from the smooth character of
functional data in the analysis methods. This paper shows how to extend the
Radial-Basis Function Networks (RBFN) and Multi-Layer Perceptron (MLP) models
to functional data inputs, in particular when the latter are known through
lists of input-output pairs. Various possibilities for functional processing
are discussed, including the projection on smooth bases, Functional Principal
Component Analysis, functional centering and reduction, and the use of
differential operators. It is shown how to incorporate these functional
processing steps into the RBFN and MLP models. The functional approach is
illustrated on a benchmark of spectrometric data analysis.
Comment: Also available online from:
http://www.sciencedirect.com/science/journal/0925231
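One of the processing options mentioned above, Functional Principal Component Analysis, can be sketched via the SVD of centered sampled curves. The toy "spectra" below (two smooth modes plus noise) are an assumption for the demo, not the benchmark data of the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 100)

# Toy "spectra": 50 curves spanned by two smooth modes plus sampling noise.
true_scores = rng.normal(size=(50, 2))
modes = np.stack([np.sin(np.pi * t), np.sin(2 * np.pi * t)], axis=0)
X = true_scores @ modes + 0.01 * rng.normal(size=(50, t.size))

# Functional centering, then PCA of the sampled curves via the SVD.
Xc = X - X.mean(axis=0)
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
explained = s**2 / np.sum(s**2)

# Scores on the first two components: the low-dimensional vector
# representation fed to an RBFN or MLP downstream.
pc_scores = Xc @ Vt[:2].T
print(explained[:3].round(3))
```

Nearly all variance lands in the first two components, so each curve is summarized by two numbers before entering the network.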
Flood. An open source neural networks C++ library
The multilayer perceptron is an important model of neural network, and
much of the literature in the field refers to that model. The multilayer
perceptron has found a wide range of applications, which include function
regression, pattern recognition, time series prediction, optimal control,
optimal shape design and inverse problems. All these problems can be
formulated as variational problems. This neural network can learn either
from databases or from mathematical models.
Flood is a comprehensive class library which implements the multilayer
perceptron in the C++ programming language. It has been developed following
the functional analysis and calculus of variations theories. In this regard,
this software tool can be used for the whole range of applications mentioned
above. Flood also provides a workaround for the solution of function
optimization problems.
Beyond Convexity: Stochastic Quasi-Convex Optimization
Stochastic convex optimization is a basic and well studied primitive in
machine learning. It is well known that convex and Lipschitz functions can be
minimized efficiently using Stochastic Gradient Descent (SGD). The Normalized
Gradient Descent (NGD) algorithm is an adaptation of Gradient Descent that
updates according to the direction of the gradients rather than the gradients
themselves. In this paper we analyze a stochastic version of NGD and prove its
convergence to a global minimum for a wider class of functions: we require the
functions to be quasi-convex and locally-Lipschitz. Quasi-convexity broadens
the concept of unimodality to multiple dimensions and allows for certain types
of saddle points, which are a known hurdle for first-order optimization methods
such as gradient descent. Locally-Lipschitz functions are only required to be
Lipschitz in a small region around the optimum. This assumption circumvents
gradient explosion, which is another known hurdle for gradient descent
variants. Interestingly, unlike the vanilla SGD algorithm, the stochastic
normalized gradient descent algorithm provably requires a minimal minibatch
size.
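The normalized update is simple to write down. The sketch below is an assumption-laden demo, not the paper's experiments: the quasi-convex objective 1 - exp(-||x - x*||^2/2) (which has vanishing gradients on a plateau far from the optimum), the Gaussian minibatch-noise model, and all step-size choices are mine.

```python
import numpy as np

rng = np.random.default_rng(0)

def sngd(grad_fn, x0, lr=0.05, steps=1000, batch=256):
    """Stochastic NGD: step along the *direction* of a minibatch gradient,
    discarding its magnitude."""
    x = np.array(x0, dtype=float)
    for _ in range(steps):
        g = grad_fn(x, batch)
        norm = np.linalg.norm(g)
        if norm > 0.0:
            x = x - lr * g / norm
    return x

# Quasi-convex but non-convex objective f(x) = 1 - exp(-||x - target||^2/2):
# far from the optimum the gradient is exponentially small (a plateau where
# plain SGD stalls), yet the normalized step keeps a constant pace.
target = np.array([1.0, -2.0])

def noisy_grad(x, batch):
    d = x - target
    g = np.exp(-0.5 * d @ d) * d                    # exact gradient of f
    return g + rng.normal(size=2) / np.sqrt(batch)  # minibatch noise model

x_final = sngd(noisy_grad, np.zeros(2))
print(np.round(x_final, 1))
```

Note how the batch size enters the noise scale: with too small a batch the normalized direction becomes unreliable near the optimum, which is one intuition for the minimal-minibatch requirement the abstract mentions.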
Parisi Phase in a Neuron
Pattern storage by a single neuron is revisited. Generalizing Parisi's
framework for spin glasses we obtain a variational free energy functional for
the neuron. The solution is demonstrated at high temperature and large relative
number of examples, where several phases are identified by thermodynamical
stability analysis, two of them exhibiting spontaneous full replica symmetry
breaking. We give analytically the curved segments of the order parameter
function and in representative cases compute the free energy, the storage
error, and the entropy.
Comment: 4 pages in PRL two-column format + 3 PostScript figures. Submitted
to Physical Review Letters.
The learnability of unknown quantum measurements
In this work, we provide an elegant framework to analyze learning matrices
in the Schatten class by taking advantage of a recently developed
methodology: matrix concentration inequalities. We establish the
fat-shattering dimension, Rademacher/Gaussian complexity, and the entropy
number of learning bounded operators and trace class operators. By casting
the tasks of learning quantum states and two-outcome quantum measurements as
learning matrices in the Schatten-1 and ∞ classes, our proposed approach
directly solves the sample complexity problems of learning quantum states
and quantum measurements. Our main result in the paper is that, for learning
an unknown quantum measurement, the upper bound, given by the fat-shattering
dimension, is linearly proportional to the dimension of the underlying
Hilbert space. Learning an unknown quantum state becomes a dual problem to
ours, and as a byproduct, we can recover Aaronson's famous result [Proc. R.
Soc. A 463, 3089–3144 (2007)] solely using a classical machine learning
technique. In addition, other famous complexity measures like covering
numbers and Rademacher/Gaussian complexities are derived explicitly under
the same framework. We are able to connect measures of sample complexity
with various areas in quantum information science, e.g. quantum
state/measurement tomography, quantum state discrimination and quantum
random access codes, which may be of independent interest. Lastly, with the
assistance of the general Bloch-sphere representation, we show that learning
quantum measurements/states can be mathematically formulated as a neural
network. Consequently, classical ML algorithms can be applied to efficiently
accomplish the two quantum learning tasks.
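The Bloch-sphere reformulation mentioned at the end can be made concrete for a qubit: for a two-outcome measurement {E, I - E} with E = a*I + b.sigma, the outcome probability p("1" | Bloch vector r) = a + b @ r is affine in r, so learning E is a single linear layer. The sketch below is an illustration under assumed values of a and b, not the paper's construction.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical unknown effect E = a*I + b.sigma (chosen so 0 <= E <= I).
a_true = 0.5
b_true = np.array([0.3, -0.2, 0.1])

# Random pure input states (Bloch vectors) and finite-shot statistics.
n_states, shots = 300, 2000
R = rng.normal(size=(n_states, 3))
R /= np.linalg.norm(R, axis=1, keepdims=True)
freq = rng.binomial(shots, a_true + R @ b_true) / shots

# "Learning the measurement" reduces to fitting one affine map, i.e. a
# single linear layer in the neural-network formulation.
A = np.hstack([np.ones((n_states, 1)), R])
coef, *_ = np.linalg.lstsq(A, freq, rcond=None)
a_hat, b_hat = coef[0], coef[1:]
print(round(float(a_hat), 2), b_hat.round(2))
```

The fitted coefficients recover (a, b) from outcome frequencies alone, which is the qubit instance of the classical-ML route to measurement learning described above.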
Methodologies for tracking of load extremes and error estimation using probabilistic techniques
This work, conducted at CIMNE under ALEF project task 1.2.3, presents an investigation
about the potential capabilities of neural networks to assist simulation campaigns. The
discrete gust response of an aircraft has been chosen as a typical problem in which the
determination of the critical loads requires exploring a large parameter space.
A very simple model has been used to compute the aerodynamic loads. This allows creating
a large database while at the same time retaining some of the fundamental properties of the
problem. Using this comprehensive dataset the effects of network structure, training method
and sampling strategy on the level of approximation over the complete domain have been
investigated. The capabilities of the neural network to predict the peak load as well as the
critical values of the design parameters have also been assessed. The applicability of neural
networks to the combination of multi-fidelity results is also explored.
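The surrogate-assisted parameter sweep described above can be sketched generically. Everything in the demo is an assumption: an analytic bump stands in for the aerodynamic load model, and a Gaussian RBF fit stands in for the report's neural networks; only the workflow (sample, fit, sweep, locate the critical case) mirrors the text.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for the discrete-gust load model: peak load as a smooth
# function of a normalized gust-length parameter h and speed v.
def load(h, v):
    return np.exp(-((h - 0.6) ** 2 + (v - 0.4) ** 2) / 0.05)

# Coarse sampling of the parameter space (the "simulation campaign").
H_tr = rng.uniform(0.0, 1.0, 200)
V_tr = rng.uniform(0.0, 1.0, 200)
y_tr = load(H_tr, V_tr)

# Gaussian RBF surrogate fitted by regularized least squares.
centers = np.stack([H_tr, V_tr], axis=1)

def features(h, v, gamma=30.0):
    pts = np.stack([h, v], axis=1)
    d2 = ((pts[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-gamma * d2)

Phi = features(H_tr, V_tr)
w = np.linalg.solve(Phi + 1e-4 * np.eye(len(centers)), y_tr)

# Sweep a fine grid with the cheap surrogate to locate the critical case.
g = np.linspace(0.0, 1.0, 101)
HH, VV = np.meshgrid(g, g, indexing="ij")
pred = features(HH.ravel(), VV.ravel()) @ w
i_max = int(np.argmax(pred))
h_crit, v_crit = HH.ravel()[i_max], VV.ravel()[i_max]
print(round(float(h_crit), 2), round(float(v_crit), 2))
```

The surrogate localizes the peak-load parameters from a modest training sample, which is the cost saving that motivates using a cheap approximator in place of the full simulation.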
Techniques of replica symmetry breaking and the storage problem of the McCulloch-Pitts neuron
In this article the framework for Parisi's spontaneous replica symmetry
breaking is reviewed, and subsequently applied to the example of the
statistical mechanical description of the storage properties of a
McCulloch-Pitts neuron. The technical details are reviewed extensively, with
regard to the wide range of systems where the method may be applied. Parisi's
partial differential equation and related differential equations are discussed,
and a Green function technique introduced for the calculation of replica
averages, the key to determining the averages of physical quantities. The
ensuing graph rules involve only tree graphs, as appropriate for a
mean-field-like model. The lowest order Ward-Takahashi identity is recovered
analytically and is shown to lead to the Goldstone modes in continuous replica
symmetry breaking phases. The need for a replica symmetry breaking theory in
the storage problem of the neuron has arisen due to the thermodynamical
instability of formerly given solutions. Variational forms for the neuron's
free energy are derived in terms of the order parameter function x(q), for
different prior distributions of synapses. Analytically in the high temperature
limit and numerically in generic cases various phases are identified, among
them one similar to the Parisi phase in the Sherrington-Kirkpatrick model.
Extensive quantities like the error per pattern change slightly with respect to
the known unstable solutions, but there is a significant difference in the
distribution of non-extensive quantities like the synaptic overlaps and the
pattern storage stability parameter. A simulation result is also reviewed and
compared to the prediction of the theory.
Comment: 103 LaTeX pages (with REVTeX 3.0), including 15 figures (ps, epsi,
eepic), accepted for Physics Reports.
Priors Stabilizers and Basis Functions: From Regularization to Radial, Tensor and Additive Splines
We had previously shown that regularization principles lead to approximation
schemes, such as Radial Basis Functions, which are equivalent to networks
with one layer of hidden units, called Regularization Networks. In this
paper we show that regularization networks encompass a much broader range of
approximation schemes, including many of the popular general additive
models, Breiman's hinge functions and some forms of Projection Pursuit
Regression. In the probabilistic interpretation of regularization, the
different classes of basis functions correspond to different classes of
prior probabilities on the approximating function spaces, and therefore to
different types of smoothness assumptions. In the final part of the paper,
we also show a relation between activation functions of the Gaussian and
sigmoidal type.
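The core construction behind regularization networks is short: minimizing a data-fit term plus a smoothness stabilizer over a reproducing-kernel space yields f(t) = sum_i c_i K(t, x_i) with coefficients solving (K + lam*I) c = y. Below is a minimal NumPy sketch with an assumed Gaussian kernel, bandwidth, and regularization weight.

```python
import numpy as np

rng = np.random.default_rng(0)

# Noisy 1-D samples of a smooth target.
x = np.linspace(-1.0, 1.0, 40)
y = np.sin(np.pi * x) + 0.05 * rng.normal(size=x.size)

# Gaussian (RBF) kernel; sigma and lam are illustrative choices.
def K(u, v, sigma=0.3):
    return np.exp(-((u[:, None] - v[None, :]) ** 2) / (2.0 * sigma**2))

# Regularization network: solve (K + lam*I) c = y, then f(t) = K(t, x) @ c.
lam = 1e-2
c = np.linalg.solve(K(x, x) + lam * np.eye(x.size), y)

t = np.linspace(-1.0, 1.0, 200)
f = K(t, x) @ c
err = np.max(np.abs(f - np.sin(np.pi * t)))
print(round(float(err), 3))
```

Swapping the kernel K exchanges one smoothness prior for another, which is exactly the correspondence between basis functions and priors that the abstract describes.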