759 research outputs found

    Learning and generalization theories of large committee machines

    The study of the distribution of volumes associated to the internal representations of learning examples allows us to derive the critical learning capacity ($\alpha_c=\frac{16}{\pi}\sqrt{\ln K}$) of large committee machines, to verify the stability of the solution in the limit of a large number $K$ of hidden units and to find a Bayesian generalization cross-over at $\alpha=K$. Comment: 14 pages, revte
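To get a feel for how slowly the quoted capacity grows with the number of hidden units, the formula can be evaluated directly. This is only a numerical illustration of the expression stated in the abstract, valid in the large-K limit; the function name `critical_capacity` is a hypothetical helper, not something from the paper.

```python
import math


def critical_capacity(k: int) -> float:
    """Evaluate alpha_c = (16 / pi) * sqrt(ln K), the large-K expression quoted above."""
    if k < 2:
        raise ValueError("K must be at least 2 so that ln K is positive")
    return 16.0 / math.pi * math.sqrt(math.log(k))


if __name__ == "__main__":
    # Hypothetical sample values of K, chosen only for illustration.
    for k in (10, 100, 1000, 10000):
        print(f"K = {k:>5d}  alpha_c ~ {critical_capacity(k):.2f}")
```

Because the dependence is through the square root of ln K, multiplying K by ten changes the capacity only modestly.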

    Analysis of the computational complexity of solving random satisfiability problems using branch and bound search algorithms

    The computational complexity of solving random 3-Satisfiability (3-SAT) problems is investigated. 3-SAT is a representative example of hard computational tasks; it consists in determining whether a set of alpha N randomly drawn logical constraints involving N Boolean variables can all be satisfied simultaneously. Widely used solving procedures, such as the Davis-Putnam-Logemann-Loveland (DPLL) algorithm, perform a systematic search for a solution through a sequence of trials and errors represented by a search tree. In the present study, we identify, using theory and numerical experiments, easy (size of the search tree scaling polynomially with N) and hard (exponential scaling) regimes as a function of the ratio alpha of constraints per variable. The typical complexity is explicitly calculated in the different regimes, in very good agreement with numerical simulations. Our theoretical approach is based on the analysis of the growth of the branches in the search tree under the operation of DPLL. On each branch, the initial 3-SAT problem is dynamically turned into a more generic 2+p-SAT problem, where p and 1-p are the fractions of constraints involving three and two variables respectively. The growth of each branch is monitored by the dynamical evolution of alpha and p and is represented by a trajectory in the static phase diagram of the random 2+p-SAT problem. Depending on whether or not the trajectories cross the boundary between phases, single branches or full trees are generated by DPLL, resulting in easy or hard resolutions. Comment: 37 RevTeX pages, 15 figures; submitted to Phys.Rev.
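The search-tree picture described in this abstract can be made concrete with a small DPLL sketch that counts visited nodes. This is a minimal illustration, not the authors' code: the branching rule (first literal of the first clause) and the names `random_3sat`, `simplify`, and `dpll` are assumptions made for the example.

```python
import random


def random_3sat(n_vars, alpha, rng):
    """Draw round(alpha * n_vars) clauses on 3 distinct variables with random signs."""
    clauses = []
    for _ in range(round(alpha * n_vars)):
        variables = rng.sample(range(1, n_vars + 1), 3)
        clauses.append([v if rng.random() < 0.5 else -v for v in variables])
    return clauses


def simplify(clauses, literal):
    """Set `literal` to True. Return the reduced clause list, or None on a contradiction."""
    reduced = []
    for clause in clauses:
        if literal in clause:
            continue  # clause is satisfied and disappears
        rest = [lit for lit in clause if lit != -literal]
        if not rest:
            return None  # clause became empty: contradiction
        reduced.append(rest)
    return reduced


def dpll(clauses, stats):
    """Return True if satisfiable; stats['nodes'] counts search-tree nodes visited."""
    stats["nodes"] += 1
    while True:  # unit propagation: satisfy 1-clauses as long as any exist
        unit = next((c[0] for c in clauses if len(c) == 1), None)
        if unit is None:
            break
        clauses = simplify(clauses, unit)
        if clauses is None:
            return False
    if not clauses:
        return True
    literal = clauses[0][0]  # naive branching choice (assumption of this sketch)
    for choice in (literal, -literal):
        reduced = simplify(clauses, choice)
        if reduced is not None and dpll(reduced, stats):
            return True
    return False


if __name__ == "__main__":
    rng = random.Random(0)
    for alpha in (2.0, 4.3, 7.0):  # sample ratios chosen for illustration
        sizes = []
        for _ in range(20):
            stats = {"nodes": 0}
            dpll(random_3sat(30, alpha, rng), stats)
            sizes.append(stats["nodes"])
        print(f"alpha = {alpha}: mean search-tree size {sum(sizes) / len(sizes):.1f}")
```

Running the loop for increasing N at fixed alpha is one way to probe the polynomial versus exponential scaling the abstract refers to, although at sizes this small the trend is only suggestive.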

    Emergence of Compositional Representations in Restricted Boltzmann Machines

    Automatically extracting the complex set of features composing real high-dimensional data is crucial for achieving high performance in machine-learning tasks. Restricted Boltzmann Machines (RBM) are empirically known to be efficient for this purpose, and to be able to generate distributed and graded representations of the data. We characterize the structural conditions (sparsity of the weights, low effective temperature, nonlinearities in the activation functions of hidden units, and adaptation of fields maintaining the activity in the visible layer) allowing RBM to operate in such a compositional phase. Evidence is provided by the replica analysis of an adequate statistical ensemble of random RBMs and by RBM trained on the handwritten digits dataset MNIST. Comment: Supplementary material available at the authors' webpag

    Reconstructing a Random Potential from its Random Walks

    The problem of how many trajectories of a random walker in a potential are needed to reconstruct the values of this potential is studied. We show that this problem can be solved by calculating the probability of survival of an abstract random walker in a partially absorbing potential. The approach is illustrated on the discrete Sinai (random force) model with a drift. We determine the parameter (temperature, duration of each trajectory, ...) values making reconstruction as fast as possible.

    Weight Space Structure and Internal Representations: a Direct Approach to Learning and Generalization in Multilayer Neural Network

    We analytically derive the geometrical structure of the weight space in multilayer neural networks (MLN), in terms of the volumes of couplings associated to the internal representations of the training set. Focusing on the parity and committee machines, we deduce their learning and generalization capabilities, both reinterpreting some known properties and finding new exact results. The relationship between our approach and information theory as well as the Mitchison-Durbin calculation is established. Our results are exact in the limit of a large number of hidden units, showing that MLN are a class of exactly solvable models with a simple interpretation of replica symmetry breaking. Comment: 12 pages, 1 compressed ps figure (uufile), RevTeX fil

    Criticality and Universality in the Unit-Propagation Search Rule

    The probability Psuccess(alpha, N) that stochastic greedy algorithms successfully solve the random SATisfiability problem is studied as a function of the ratio alpha of constraints per variable and the number N of variables. These algorithms assign variables according to the unit-propagation (UP) rule in the presence of constraints involving a unique variable (1-clauses), and according to some heuristic (H) prescription otherwise. In the infinite N limit, Psuccess vanishes at some critical ratio alpha_H which depends on the heuristic H. We show that the critical behaviour is determined by the UP rule only. In the case where only constraints with 2 and 3 variables are present, we give the phase diagram and identify two universality classes: the power law class, where Psuccess[alpha_H (1+epsilon N^{-1/3}), N] ~ A(epsilon)/N^gamma; the stretched exponential class, where Psuccess[alpha_H (1+epsilon N^{-1/3}), N] ~ exp[-N^{1/6} Phi(epsilon)]. Which class is selected depends on the characteristic parameters of input data. The critical exponent gamma is universal and calculated; the scaling functions A and Phi weakly depend on the heuristic H and are obtained from the solutions of reaction-diffusion equations for 1-clauses. Computation of some non-universal corrections allows us to match numerical results with good precision. The critical behaviour for constraints with >3 variables is given. Our results are interpreted in terms of dynamical graph percolation and we argue that they should apply to more general situations where UP is used. Comment: 30 pages, 13 figure
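The quantity Psuccess(alpha, N) can be estimated by Monte Carlo for a concrete greedy algorithm without backtracking. The sketch below is an illustration only: it assumes one particular heuristic H (pick a random unassigned variable and a random value), and the names `random_ksat`, `assign`, `greedy_up_run`, and `success_probability` are hypothetical.

```python
import random


def random_ksat(n_vars, alpha, rng, k=3):
    """Draw round(alpha * n_vars) clauses on k distinct variables with random signs."""
    clauses = []
    for _ in range(round(alpha * n_vars)):
        variables = rng.sample(range(1, n_vars + 1), k)
        clauses.append([v if rng.random() < 0.5 else -v for v in variables])
    return clauses


def assign(clauses, literal):
    """Set `literal` to True. Return the reduced clause list, or None on a contradiction."""
    reduced = []
    for clause in clauses:
        if literal in clause:
            continue  # satisfied clause is removed
        rest = [lit for lit in clause if lit != -literal]
        if not rest:
            return None  # empty clause: the run has failed
        reduced.append(rest)
    return reduced


def greedy_up_run(clauses, n_vars, rng):
    """One run without backtracking. Returns True if all clauses get satisfied."""
    free = set(range(1, n_vars + 1))
    while clauses:
        unit = next((c[0] for c in clauses if len(c) == 1), None)
        if unit is not None:
            literal = unit  # UP rule: a 1-clause forces the assignment
        else:
            # Heuristic H assumed here: random free variable, random value.
            var = rng.choice(sorted(free))
            literal = var if rng.random() < 0.5 else -var
        free.discard(abs(literal))
        clauses = assign(clauses, literal)
        if clauses is None:
            return False
    return True


def success_probability(n_vars, alpha, runs, seed=0):
    """Fraction of independent random instances solved by a single greedy run."""
    rng = random.Random(seed)
    wins = sum(
        greedy_up_run(random_ksat(n_vars, alpha, rng), n_vars, rng)
        for _ in range(runs)
    )
    return wins / runs


if __name__ == "__main__":
    for alpha in (1.0, 2.0, 3.0, 4.0):  # sample ratios chosen for illustration
        print(alpha, success_probability(n_vars=100, alpha=alpha, runs=200))
```

Swapping the random choice for another rule in the `else` branch gives a different heuristic H while leaving the UP step untouched, which is the separation the abstract relies on.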