
    Theoretical Properties of Projection Based Multilayer Perceptrons with Functional Inputs

    Many real-world data are sampled functions. As shown by Functional Data Analysis (FDA) methods, spectra, time series, images, gesture recognition data, etc. can be processed more efficiently if their functional nature is taken into account during the data analysis process. This is done by extending standard data analysis methods so that they apply to functional inputs. A general way to achieve this goal is to compute projections of the functional data onto a finite-dimensional subspace of the functional space. The coordinates of the data on a basis of this subspace provide standard vector representations of the functions, and the resulting vectors can be processed by any standard method. In our previous work, this general approach was used to define projection-based Multilayer Perceptrons (MLPs) with functional inputs. In this paper we study important theoretical properties of the proposed model. In particular, we show that MLPs with functional inputs are universal approximators: they can approximate to arbitrary accuracy any continuous mapping from a compact subspace of a functional space to R. Moreover, we provide a consistency result showing that any mapping from a functional space to R can be learned from examples by a projection-based MLP: the generalization mean square error of the MLP decreases to the smallest possible mean square error on the data as the number of examples goes to infinity.
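    The projection step is easy to make concrete. Below is a minimal sketch of the approach, not the authors' code: toy sampled curves are projected onto a truncated Fourier basis by least squares, and the resulting coordinates feed a standard MLP. The basis, its size, and the scikit-learn regressor are illustrative assumptions.

        import numpy as np
        from sklearn.neural_network import MLPRegressor

        # Sampling grid and a toy set of sampled functions (spectra-like curves).
        t = np.linspace(0.0, 1.0, 100)                      # observation points
        rng = np.random.default_rng(0)
        n_curves = 200
        freq = rng.uniform(1.0, 5.0, size=n_curves)         # hidden parameter per curve
        X_sampled = np.sin(2 * np.pi * np.outer(freq, t))   # (n_curves, 100) samples
        y = freq                                            # scalar target to learn

        # Finite-dimensional projection basis: truncated Fourier basis on [0, 1].
        K = 7
        basis = np.stack([np.sin(2 * np.pi * (k + 1) * t) for k in range(K)]
                         + [np.cos(2 * np.pi * (k + 1) * t) for k in range(K)], axis=1)

        # Coordinates of each sampled function on the basis (least-squares projection).
        coords, *_ = np.linalg.lstsq(basis, X_sampled.T, rcond=None)
        Z = coords.T                                        # (n_curves, 2K) vector inputs

        # Any standard method now applies; here, an ordinary MLP on the coordinates.
        mlp = MLPRegressor(hidden_layer_sizes=(16,), max_iter=5000, random_state=0)
        mlp.fit(Z, y)
        print("train R^2:", mlp.score(Z, y))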

    Statistical learnability of nuclear masses

    More than 80 years after the seminal work of Weizsäcker and the liquid drop model of the atomic nucleus, the deviations of mass models from experiment (∼ MeV) are orders of magnitude larger than experimental errors (≲ keV). Predicting the mass of atomic nuclei with precision is extremely challenging. This is due to the non-trivial many-body interplay of protons and neutrons in nuclei, and the complex nature of the nuclear strong force. The statistical theory of learning is used to provide bounds on the prediction errors of models trained with a finite data set. These bounds are validated with neural network calculations and compared with state-of-the-art mass models. It is therefore argued that nuclear structure models investigating ground-state properties explore a system at the limit of what can be known, as defined by the statistical theory of learning.
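    The abstract does not reproduce the bounds themselves. As a reminder of the general form such statistical-learning results take (an assumption on my part, not the paper's specific statement), a typical capacity-based bound for a model of effective capacity d trained on N measured masses reads

        \[
        \mathbb{E}_{\text{test}}[\varepsilon]
        \;\le\; \varepsilon_{\text{train}}
        + O\!\left(\sqrt{\frac{d \,\log N}{N}}\right),
        \]

    so prediction errors can remain large whenever the capacity needed to describe the data is comparable to the number of available masses.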

    Representation of Functional Data in Neural Networks

    Functional Data Analysis (FDA) is an extension of traditional data analysis to functional data, for example spectra, time series, spatio-temporal images, gesture recognition data, etc. Functional data are rarely known in practice; usually only a regular or irregular sampling is known. For this reason, some preprocessing is needed in order to benefit from the smooth character of functional data in the analysis methods. This paper shows how to extend the Radial-Basis Function Network (RBFN) and Multi-Layer Perceptron (MLP) models to functional data inputs, in particular when the latter are known through lists of input-output pairs. Various possibilities for functional processing are discussed, including projection on smooth bases, Functional Principal Component Analysis, functional centering and reduction, and the use of differential operators. It is shown how to incorporate this functional processing into the RBFN and MLP models. The functional approach is illustrated on a benchmark of spectrometric data analysis.
    Comment: Also available online from: http://www.sciencedirect.com/science/journal/0925231
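    Of the processing options listed, Functional Principal Component Analysis is perhaps the simplest to illustrate. A minimal sketch follows (the toy curves and component count are my assumptions, not the paper's data): center the sampled curves and run PCA on them, using the scores as the low-dimensional vector input of an RBFN or MLP.

        import numpy as np
        from sklearn.decomposition import PCA

        rng = np.random.default_rng(1)
        t = np.linspace(0.0, 1.0, 128)
        curves = np.array([np.exp(-((t - m) ** 2) / 0.02)
                           for m in rng.uniform(0.2, 0.8, size=150)])   # toy spectra

        # Functional centering: subtract the mean curve from every sample.
        centered = curves - curves.mean(axis=0)

        # Functional PCA, approximated by PCA on the sampled values; the scores
        # provide the vector representation fed to an RBFN or MLP.
        fpca = PCA(n_components=4)
        scores = fpca.fit_transform(centered)       # (150, 4) functional PC scores
        print("explained variance:", fpca.explained_variance_ratio_.round(3))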

    Flood. An open source neural networks C++ library

    The multilayer perceptron is an important neural network model, and much of the literature in the field refers to that model. The multilayer perceptron has found a wide range of applications, which include function regression, pattern recognition, time series prediction, optimal control, optimal shape design and inverse problems. All these problems can be formulated as variational problems, and this neural network can learn either from databases or from mathematical models. Flood is a comprehensive class library which implements the multilayer perceptron in the C++ programming language. It has been developed following the theories of functional analysis and the calculus of variations. In this regard, this software tool can be used for the whole range of applications mentioned above. Flood also provides a workaround for the solution of function optimization problems.
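    To make the variational viewpoint concrete, here is a generic Python sketch, not Flood's C++ API: function regression posed as minimizing a discretized error functional F[y] = ∫ (y(x) − t(x))² dx over the parameters of a small one-hidden-layer network. The target, network size, and optimizer are illustrative assumptions.

        import numpy as np
        from scipy.optimize import minimize

        # Discretized domain and target for a toy function-regression problem.
        x = np.linspace(-1.0, 1.0, 50)
        dx = x[1] - x[0]
        target = np.sin(np.pi * x)

        H = 10  # hidden neurons of a one-hidden-layer tanh network

        def net(p, x):
            w1, b1, w2, b2 = np.split(p, [H, 2 * H, 3 * H])
            return np.tanh(np.outer(x, w1) + b1) @ w2 + b2[0]

        # Error functional F[y] = integral of (y(x) - t(x))^2 dx, discretized.
        def functional(p):
            return np.sum((net(p, x) - target) ** 2) * dx

        p0 = 0.5 * np.random.default_rng(2).standard_normal(3 * H + 1)
        res = minimize(functional, p0, method="BFGS")
        print("final value of the error functional:", res.fun)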

    Beyond Convexity: Stochastic Quasi-Convex Optimization

    Stochastic convex optimization is a basic and well-studied primitive in machine learning. It is well known that convex and Lipschitz functions can be minimized efficiently using Stochastic Gradient Descent (SGD). The Normalized Gradient Descent (NGD) algorithm is an adaptation of Gradient Descent which updates according to the direction of the gradients, rather than the gradients themselves. In this paper we analyze a stochastic version of NGD and prove its convergence to a global minimum for a wider class of functions: we require the functions to be quasi-convex and locally Lipschitz. Quasi-convexity broadens the concept of unimodality to multiple dimensions and allows for certain types of saddle points, which are a known hurdle for first-order optimization methods such as gradient descent. Locally Lipschitz functions are only required to be Lipschitz in a small region around the optimum. This assumption circumvents gradient explosion, which is another known hurdle for gradient descent variants. Interestingly, unlike the vanilla SGD algorithm, the stochastic normalized gradient descent algorithm provably requires a minimal minibatch size.
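    The update rule itself is simple: take a minibatch gradient and step along its direction only. A minimal sketch follows; the toy least-squares data, step-size schedule, and hyperparameters are illustrative assumptions, not the paper's settings.

        import numpy as np

        rng = np.random.default_rng(3)

        # Toy data; the point is the update rule, which uses only the
        # *direction* of the minibatch gradient, never its magnitude.
        X = rng.normal(size=(1000, 5))
        w_true = rng.normal(size=5)
        y = X @ w_true + 0.01 * rng.normal(size=1000)

        def minibatch_grad(w, idx):
            xb, yb = X[idx], y[idx]
            return 2.0 * xb.T @ (xb @ w - yb) / len(idx)

        w = np.zeros(5)
        eta, B, T = 0.5, 64, 2000      # base step size, minibatch size, iterations
        for t in range(T):
            idx = rng.choice(len(X), size=B, replace=False)
            g = minibatch_grad(w, idx)
            norm = np.linalg.norm(g)
            if norm > 1e-12:
                w -= (eta / np.sqrt(t + 1)) * g / norm   # normalized step
        print("distance to w*:", np.linalg.norm(w - w_true))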

    Parisi Phase in a Neuron

    Pattern storage by a single neuron is revisited. Generalizing Parisi's framework for spin glasses, we obtain a variational free energy functional for the neuron. The solution is demonstrated at high temperature and a large relative number of examples, where several phases are identified by thermodynamical stability analysis, two of them exhibiting spontaneous full replica symmetry breaking. We give analytically the curved segments of the order parameter function and, in representative cases, compute the free energy, the storage error, and the entropy.
    Comment: 4 pages in PRL two-column format + 3 PostScript figures. Submitted to Physical Review Letters.

    The learnability of unknown quantum measurements

    In this work, we provide an elegant framework to analyze the learning of matrices in the Schatten classes by taking advantage of a recently developed methodology: matrix concentration inequalities. We establish the fat-shattering dimension, Rademacher/Gaussian complexity, and entropy number of learning bounded operators and trace-class operators. By characterising the tasks of learning quantum states and two-outcome quantum measurements as learning matrices in the Schatten 1 and ∞ classes, our proposed approach directly solves the sample complexity problems of learning quantum states and quantum measurements. Our main result is that, for learning an unknown quantum measurement, the upper bound, given by the fat-shattering dimension, is linearly proportional to the dimension of the underlying Hilbert space. Learning an unknown quantum state becomes a dual problem to ours, and as a byproduct we can recover Aaronson's famous result [Proc. R. Soc. A 463, 3089-3144 (2007)] solely using a classical machine learning technique. In addition, other famous complexity measures, such as covering numbers and Rademacher/Gaussian complexities, are derived explicitly under the same framework. We are able to connect measures of sample complexity with various areas in quantum information science, e.g. quantum state/measurement tomography, quantum state discrimination and quantum random access codes, which may be of independent interest. Lastly, with the assistance of the general Bloch-sphere representation, we show that learning quantum measurements/states can be mathematically formulated as a neural network. Consequently, classical ML algorithms can be applied to efficiently accomplish the two quantum learning tasks.
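    The Bloch-sphere formulation has a particularly simple special case for a single qubit, sketched below as a toy illustration under my own assumptions, not the paper's construction: a two-outcome effect E = aI + b·σ gives outcome probability Tr(Eρ) = a + b·r for a state with Bloch vector r, so learning E from (state, probability) pairs reduces to a linear model on the Bloch coordinates.

        import numpy as np

        rng = np.random.default_rng(4)

        # Unknown two-outcome qubit effect E = a*I + b . sigma; for a state
        # rho = (I + r . sigma)/2 the outcome probability is Tr(E rho) = a + b . r.
        a_true, b_true = 0.5, np.array([0.2, -0.1, 0.3])

        # Random pure states as Bloch vectors on the unit sphere.
        r = rng.normal(size=(500, 3))
        r /= np.linalg.norm(r, axis=1, keepdims=True)
        p = a_true + r @ b_true          # outcome probabilities (noiseless here)

        # A single linear unit on the Bloch features recovers (a, b).
        A = np.hstack([np.ones((500, 1)), r])
        coef, *_ = np.linalg.lstsq(A, p, rcond=None)
        print("estimated (a, b):", coef.round(3))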

    Methodologies for tracking of load extremes and error estimation using probabilistic techniques

    This work, conducted at CIMNE under ALEF project task 1.2.3, investigates the potential of neural networks to assist simulation campaigns. The discrete gust response of an aircraft has been chosen as a typical problem in which the determination of the critical loads requires exploring a large parameter space. A very simple model has been used to compute the aerodynamic loads; this allows creating a large database while retaining some of the fundamental properties of the problem. Using this comprehensive dataset, the effects of network structure, training method and sampling strategy on the level of approximation over the complete domain have been investigated. The capability of the neural network to predict the peak load as well as the critical values of the design parameters has also been assessed. The applicability of neural networks to the combination of multi-fidelity results is also explored.
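    As a minimal sketch of the surrogate-model idea (the two gust parameters, the toy load formula, and the network settings are invented for illustration and are not the report's model): train a network on sampled (parameter, load) pairs, then sweep the surrogate over the parameter space to track the peak load and its critical parameters.

        import numpy as np
        from sklearn.neural_network import MLPRegressor
        from sklearn.pipeline import make_pipeline
        from sklearn.preprocessing import StandardScaler

        rng = np.random.default_rng(5)

        # Hypothetical 2-parameter gust space: gradient length Hg and velocity V.
        Hg = rng.uniform(10.0, 110.0, size=400)
        V = rng.uniform(5.0, 20.0, size=400)
        load = V * np.exp(-((Hg - 60.0) ** 2) / 800.0)  # toy stand-in for the simulation

        X = np.column_stack([Hg, V])
        surrogate = make_pipeline(
            StandardScaler(),
            MLPRegressor(hidden_layer_sizes=(32, 32), max_iter=5000, random_state=0),
        ).fit(X, load)

        # Query the surrogate on a fine grid to locate the critical (peak) load.
        G1, G2 = np.meshgrid(np.linspace(10, 110, 200), np.linspace(5, 20, 200))
        grid = np.column_stack([G1.ravel(), G2.ravel()])
        pred = surrogate.predict(grid)
        i = pred.argmax()
        print("predicted peak load %.2f at H=%.1f, V=%.1f"
              % (pred[i], grid[i, 0], grid[i, 1]))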

    Techniques of replica symmetry breaking and the storage problem of the McCulloch-Pitts neuron

    In this article the framework of Parisi's spontaneous replica symmetry breaking is reviewed, and subsequently applied to the example of the statistical mechanical description of the storage properties of a McCulloch-Pitts neuron. The technical details are reviewed extensively, with regard to the wide range of systems where the method may be applied. Parisi's partial differential equation and related differential equations are discussed, and a Green function technique is introduced for the calculation of replica averages, the key to determining the averages of physical quantities. The ensuing graph rules involve only tree graphs, as appropriate for a mean-field-like model. The lowest-order Ward-Takahashi identity is recovered analytically and is shown to lead to the Goldstone modes in continuous replica symmetry breaking phases. The need for a replica symmetry breaking theory in the storage problem of the neuron has arisen due to the thermodynamical instability of previously given solutions. Variational forms for the neuron's free energy are derived in terms of the order parameter function x(q), for different prior distributions of synapses. Analytically in the high-temperature limit and numerically in generic cases, various phases are identified, among them one similar to the Parisi phase in the Sherrington-Kirkpatrick model. Extensive quantities like the error per pattern change only slightly with respect to the known unstable solutions, but there is a significant difference in the distribution of non-extensive quantities like the synaptic overlaps and the pattern storage stability parameter. A simulation result is also reviewed and compared to the prediction of the theory.
    Comment: 103 LaTeX pages (with REVTeX 3.0), including 15 figures (ps, epsi, eepic), accepted for Physics Reports.
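    For reference, in the Sherrington-Kirkpatrick setting Parisi's partial differential equation for the local free energy field f(q, h) takes the standard form quoted here from the spin-glass literature (with the inverse temperature absorbed into the units), not from this article's neuron-specific derivation:

        \[
        \frac{\partial f}{\partial q}
        \;=\; -\frac{1}{2}\left[
        \frac{\partial^{2} f}{\partial h^{2}}
        + x(q)\left(\frac{\partial f}{\partial h}\right)^{2}
        \right],
        \]

    with the order parameter function x(q) entering as the coefficient of the nonlinear term, so that solving the variational problem amounts to optimizing the free energy over x(q).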

    Priors Stabilizers and Basis Functions: From Regularization to Radial, Tensor and Additive Splines

    We had previously shown that regularization principles lead to approximation schemes, such as Radial Basis Functions, which are equivalent to networks with one layer of hidden units, called Regularization Networks. In this paper we show that regularization networks encompass a much broader range of approximation schemes, including many of the popular general additive models, Breiman's hinge functions and some forms of Projection Pursuit Regression. In the probabilistic interpretation of regularization, the different classes of basis functions correspond to different classes of prior probabilities on the approximating function spaces, and therefore to different types of smoothness assumptions. In the final part of the paper, we also show a relation between activation functions of the Gaussian and sigmoidal type.
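    The Radial Basis Function case admits a compact sketch (toy data; the Gaussian kernel width and regularization strength are illustrative choices, not the paper's): minimizing the regularized empirical error over a reproducing-kernel space yields f(x) = Σᵢ cᵢ G(x, xᵢ), with coefficients solving the linear system (G + λI)c = y.

        import numpy as np

        rng = np.random.default_rng(6)

        # Noisy 1-D samples of a smooth target function.
        x = rng.uniform(-3.0, 3.0, size=60)
        y = np.sinc(x) + 0.05 * rng.normal(size=60)

        # Gaussian basis function centered at each data point.
        def gauss(a, b, s=0.5):
            return np.exp(-(a[:, None] - b[None, :]) ** 2 / (2 * s ** 2))

        # Regularization network: solve (G + lam*I) c = y for the coefficients.
        lam = 1e-3
        G = gauss(x, x)
        c = np.linalg.solve(G + lam * np.eye(len(x)), y)

        # Evaluate f(x) = sum_i c_i G(x, x_i) at new points.
        x_test = np.linspace(-3, 3, 7)
        print(np.round(gauss(x_test, x) @ c, 3))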