1,193 research outputs found

    On-Line Learning Theory of Soft Committee Machines with Correlated Hidden Units - Steepest Gradient Descent and Natural Gradient Descent -

    The permutation symmetry of the hidden units in multilayer perceptrons causes the saddle structure and plateaus of the learning dynamics in gradient learning methods. The correlation of the weight vectors of hidden units in a teacher network is thought to affect this saddle structure, resulting in a prolonged learning time, but this mechanism is still unclear. In this paper, we discuss it with regard to soft committee machines and on-line learning using statistical mechanics. Conventional gradient descent needs more time to break the symmetry as the correlation of the teacher weight vectors rises. On the other hand, no plateaus occur with natural gradient descent, regardless of the correlation, in the limit of a low learning rate. Analytical results support these dynamics around the saddle point. Comment: 7 pages, 6 figures

    A Neural Network model with Bidirectional Whitening

    We present here a new model and algorithm that perform efficient natural gradient descent for multilayer perceptrons. Natural gradient descent was originally proposed from the point of view of information geometry, and it performs steepest descent updates on manifolds in a Riemannian space. In particular, we extend the approach taken by the "Whitened neural networks" model. We apply the whitening process not only in the feed-forward direction, as in the original model, but also in the back-propagation phase. Its efficacy is shown by applying this "Bidirectional whitened neural networks" model to handwritten character recognition data (MNIST). Comment: 16 pages
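
    As a rough illustration of what a natural gradient update looks like, the sketch below preconditions the ordinary gradient with the inverse Fisher information matrix. It is not the bidirectional whitening algorithm from the abstract above: the model (logistic regression), the function names, and the `lr` and `damping` parameters are all assumptions chosen to keep the example small.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def mean_nll(w, X, y):
    # Mean negative log-likelihood of a logistic model, written stably.
    z = X @ w
    return np.mean(np.logaddexp(0.0, z) - y * z)

def natural_gradient_step(w, X, y, lr=0.5, damping=1e-3):
    # Ordinary (Euclidean) gradient of the mean negative log-likelihood.
    p = sigmoid(X @ w)
    grad = X.T @ (p - y) / len(y)
    # Fisher information of the logistic model, averaged over the inputs:
    # F = mean_i p_i (1 - p_i) x_i x_i^T. Damping keeps it invertible.
    fisher = (X * (p * (1.0 - p))[:, None]).T @ X / len(y)
    fisher += damping * np.eye(len(w))
    # Natural gradient update: w <- w - lr * F^{-1} grad.
    return w - lr * np.linalg.solve(fisher, grad)

def fit(X, y, steps=50):
    # Hypothetical helper: repeat the update from a zero initialization.
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        w = natural_gradient_step(w, X, y)
    return w
```

    Solving the linear system with `np.linalg.solve` avoids forming the inverse explicitly. For networks with many parameters the Fisher matrix is too large to handle this way, which is the cost that whitening-based approximations aim to reduce.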

    Fluctuation Theorems on Nishimori Line

    The distribution of the performed work for spin glasses with gauge symmetry is considered. With the aid of the gauge symmetry, which leads to exact and rigorous results in spin glasses, we find a relation for the performed work that takes the form of a fluctuation theorem. The integral form of the resultant relation reproduces the Jarzynski-type equation for spin glasses that we obtained previously. We show that similar relations can be established not only for the distribution of the performed work but also for that of the free energy of spin glasses with gauge symmetry, which provides another interpretation of the phase transition in spin glasses. Comment: 10 pages and 1 figure

    Parametric Fokker-Planck equation

    We derive the Fokker-Planck equation on the parametric space. It is the Wasserstein gradient flow of relative entropy on the statistical manifold. We pull back the PDE to a finite-dimensional ODE on parameter space. An analytical example and numerical examples are presented.

    Bifurcation analysis in an associative memory model

    We previously reported the chaos induced by the frustration of interaction in a non-monotonic sequential associative memory model, and showed the chaotic behaviors at absolute zero. We have now analyzed bifurcation in a stochastic system, namely a finite-temperature version of the non-monotonic sequential associative memory model. We derived order-parameter equations from the stochastic microscopic equations. Two-parameter bifurcation diagrams obtained from those equations show the coexistence of attractors, which do not appear at absolute zero, and the disappearance of chaos due to the temperature effect. Comment: 19 pages

    Field Theoretical Analysis of On-line Learning of Probability Distributions

    On-line learning of probability distributions is analyzed from the field theoretical point of view. We can obtain an optimal on-line learning algorithm, since the renormalization group enables us to control the number of degrees of freedom of a system according to the number of examples. We do not learn the parameters of a model, but the probability distributions themselves. Therefore, the algorithm requires no a priori knowledge of a model. Comment: 4 pages, 1 figure, RevTeX

    The core structure of presolar graphite onions

    Of the "presolar particles" extracted from carbonaceous chondrite dissolution residues, i.e. of those particles which show isotopic evidence of solidification in the neighborhood of other stars prior to the origin of our solar system, one subset has an interesting concentric graphite-rim/graphene-core structure. We show here that single graphene sheet defects in the onion cores (e.g. cyclopentane loops) may be observable edge-on by HREM. This could allow a closer look at models for their formation, and in particular strengthen the possibility that growth of these assemblages proceeds atom-by-atom with the aid of such in-plane defects, under conditions of growth (e.g. radiation fluxes or grain temperature) which discourage the graphite layering that dominates subsequent formation of the rim. Comment: 4 pages, 7 figures, 11 refs, see also http://www.umsl.edu/~fraundor/isocore.htm

    Nonparametric Information Geometry

    The differential-geometric structure of the set of positive densities on a given measure space has raised the interest of many mathematicians after the discovery by C.R. Rao of the geometric meaning of the Fisher information. Most of the research is focused on parametric statistical models. In a series of papers by the author and coworkers, a particular version of the nonparametric case has been discussed. It consists of a minimalistic structure modeled according to the theory of exponential families: given a reference density, other densities are represented by the centered log-likelihood, which is an element of an Orlicz space. These mappings give a system of charts of a Banach manifold. It has been observed that, while the construction is natural, the practical applicability is limited by the technical difficulty of dealing with such a class of Banach spaces. It has been suggested recently to replace the exponential function with other functions with similar behavior but polynomial growth at infinity in order to obtain more tractable Banach spaces, e.g. Hilbert spaces. We first give a review of our theory with special emphasis on the specific issues of the infinite-dimensional setting. In a second part we discuss two specific topics, differential equations and the metric connection. The position of this line of research with respect to other approaches is briefly discussed. Comment: Submitted for publication in the Proceedings of GSI2013, Aug 28-30 2013, Paris

    The volume of Gaussian states by information geometry

    We formulate the problem of determining the volume of the set of Gaussian physical states in the framework of information geometry. That is, we consider phase space probability distributions parametrized by the covariances and equip the resulting statistical manifold with the Fisher-Rao metric. We then evaluate the volume of classical, quantum, and quantum entangled states for two-mode systems, showing chains of strict inclusion.