
    Efficiency of quantum versus classical annealing in non-convex learning problems

    Quantum annealers aim at solving non-convex optimization problems by exploiting cooperative tunneling effects to escape local minima. The underlying idea consists in designing a classical energy function whose ground states are the sought optimal solutions of the original optimization problem, and in adding a controllable quantum transverse field to generate tunneling processes. A key challenge is to identify classes of non-convex optimization problems for which quantum annealing remains efficient while thermal annealing fails. We show that this happens for a wide class of problems which are central to machine learning. Their energy landscapes are dominated by local minima that cause an exponential slowdown of classical thermal annealers, while simulated quantum annealing converges efficiently to rare dense regions of optimal solutions. Comment: 31 pages, 10 figures
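As a point of reference for the classical side of this comparison, the thermal-annealing baseline can be sketched as a Metropolis chain with a slow geometric cooling schedule on a toy one-dimensional non-convex landscape. The function, schedule, and all parameters below are illustrative assumptions, not taken from the paper:

```python
import math
import random

def simulated_annealing(energy, x0, steps=20000, t0=2.0, t_end=0.01, seed=0):
    """Classical thermal annealing: Metropolis moves with geometric cooling."""
    rng = random.Random(seed)
    x, e = x0, energy(x0)
    best_x, best_e = x, e
    cool = (t_end / t0) ** (1.0 / steps)   # geometric cooling factor
    t = t0
    for _ in range(steps):
        x_new = x + rng.gauss(0.0, 0.5)    # local Gaussian move
        e_new = energy(x_new)
        # accept downhill moves always, uphill moves with Boltzmann probability
        if e_new < e or rng.random() < math.exp(-(e_new - e) / t):
            x, e = x_new, e_new
            if e < best_e:
                best_x, best_e = x, e
        t *= cool                          # slowly lower the temperature
    return best_x, best_e

# Toy tilted double well: global minimum near x = -1.47,
# shallower local minimum near x = 1.35.
f = lambda x: x ** 4 - 4.0 * x ** 2 + x
```

Started in the shallower basin (e.g. `x0 = 1.3`), the chain typically crosses the barrier during the high-temperature phase and ends near the global minimum.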

    The Entropy of the K-Satisfiability Problem

    The threshold behaviour of the K-Satisfiability problem is studied in the framework of the statistical mechanics of random diluted systems. We find that at the transition the entropy is finite, and hence that the transition itself is due to the abrupt appearance of logical contradictions in all solutions and not to the progressive decrease of the number of these solutions down to zero. A physical interpretation is given for the different cases K = 1, K = 2 and K ≄ 3. Comment: revtex, 11 pages + 1 figure
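The entropy in question is the logarithm of the number of satisfying assignments (per variable); on instances small enough for brute force it can be measured directly. A toy sketch, where the `random_ksat` helper and all instance sizes are our own illustrative choices, not the paper's replica computation:

```python
import itertools
import math
import random

def random_ksat(n, m, k, seed=0):
    """m random k-clauses over n Boolean variables.
    A clause is a tuple of (variable index, negated?) literals."""
    rng = random.Random(seed)
    return [tuple((v, rng.random() < 0.5)
                  for v in rng.sample(range(n), k)) for _ in range(m)]

def entropy_per_variable(n, clauses):
    """Brute-force count of solutions and log(#solutions)/n (tiny n only)."""
    count = 0
    for bits in itertools.product((False, True), repeat=n):
        # a literal (v, neg) is satisfied iff bits[v] differs from neg
        if all(any(bits[v] != neg for v, neg in clause) for clause in clauses):
            count += 1
    return count, (math.log(count) / n if count else float("-inf"))
```

Well below the threshold (e.g. K = 3 at clause density m/n = 1), the measured entropy per variable is strictly positive, consistent with the abstract's finite-entropy scenario at the transition.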

    Sign problem in the Bethe approximation

    We propose a message-passing algorithm to compute the Hamiltonian expectation with respect to an appropriate class of trial wave functions for an interacting system of fermions. To this end, we connect the quantum expectations to average quantities in a classical system with both local and global interactions, which are related to the variational parameters, and use the Bethe approximation to estimate the average energy within the replica-symmetric approximation. The global interactions, which are needed to obtain a good estimation of the average fermion sign, make the average energy a nonlocal function of the variational parameters. We use some heuristic minimization algorithms to find approximate ground states of the Hubbard model on random regular graphs and observe significant qualitative improvements with respect to the mean-field approximation. Comment: 19 pages, 9 figures, one figure added

    Learning and generalization theories of large committee-machines

    The study of the distribution of volumes associated to the internal representations of learning examples allows us to derive the critical learning capacity α_c = (16/π)√(ln K) of large committee machines, to verify the stability of the solution in the limit of a large number K of hidden units, and to find a Bayesian generalization cross-over at α = K. Comment: 14 pages, revtex
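The capacity formula above can be read off numerically; the snippet below is a direct transcription of the abstract's expression (the function name is ours), useful only to see the slow √(ln K) growth:

```python
import math

def committee_capacity(k):
    """Large-K critical capacity alpha_c = (16/pi) * sqrt(ln K),
    as quoted in the abstract."""
    return 16.0 / math.pi * math.sqrt(math.log(k))
```

For example, the capacity grows with the number of hidden units K, but only logarithmically slowly.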

    Weight Space Structure and Internal Representations: a Direct Approach to Learning and Generalization in Multilayer Neural Networks

    We analytically derive the geometrical structure of the weight space in multilayer neural networks (MLN), in terms of the volumes of couplings associated to the internal representations of the training set. Focusing on the parity and committee machines, we deduce their learning and generalization capabilities, both reinterpreting some known properties and finding new exact results. The relationship between our approach and information theory, as well as the Mitchison-Durbin calculation, is established. Our results are exact in the limit of a large number of hidden units, showing that MLN are a class of exactly solvable models with a simple interpretation of replica symmetry breaking. Comment: 12 pages, 1 compressed ps figure (uufile), RevTeX file

    On the performance of a cavity method based algorithm for the Prize-Collecting Steiner Tree Problem on graphs

    We study the behavior of an algorithm derived from the cavity method for the Prize-Collecting Steiner Tree (PCST) problem on graphs. The algorithm is based on the zero-temperature limit of the cavity equations and as such is formally simple (a fixed point equation resolved by iteration) and distributed (parallelizable). We provide a detailed comparison with state-of-the-art algorithms on a wide range of existing benchmark networks and random graphs. Specifically, we consider an enhanced derivative of the Goemans-Williamson heuristic and the DHEA solver, a Branch-and-Cut Linear/Integer Programming based approach. The comparison shows that the cavity algorithm outperforms the two algorithms in most large instances, both in running time and in quality of the solution. Finally, we prove a few optimality properties of the solutions provided by our algorithm, including optimality under the two post-processing procedures defined in the Goemans-Williamson derivative and global optimality in some limit cases.

    Efficient LDPC Codes over GF(q) for Lossy Data Compression

    In this paper we consider the lossy compression of a binary symmetric source. We present a scheme that provides a low-complexity lossy compressor with near-optimal empirical performance. The proposed scheme is based on b-reduced ultra-sparse LDPC codes over GF(q). Encoding is performed by the Reinforced Belief Propagation algorithm, a variant of Belief Propagation. The computational complexity at the encoder is O(⟨d⟩·n·q·log q), where ⟨d⟩ is the average degree of the check nodes. For our code ensemble, decoding can be performed iteratively following the inverse steps of the leaf removal algorithm. For a sparse parity-check matrix the number of needed operations is O(n). Comment: 5 pages, 3 figures
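The leaf-removal idea mentioned above peels, one at a time, checks that involve a single undetermined variable. A minimal sketch over GF(2) rather than GF(q), with a toy data layout of our own choosing; this is the peeling step only, not the paper's Reinforced BP encoder:

```python
def leaf_removal(checks, syndrome):
    """Solve a sparse GF(2) parity-check system by peeling degree-1 checks.

    checks   : list of lists of variable indices, one list per check
    syndrome : target parity bit for each check
    Returns the partial assignment {variable: bit} reachable by peeling.
    """
    x = {}
    pending = list(zip([list(c) for c in checks], syndrome))
    progress = True
    while progress and pending:
        progress, rest = False, []
        for vs, s in pending:
            undet = [v for v in vs if v not in x]
            if len(undet) == 0:
                continue          # check fully determined, drop it
            if len(undet) == 1:
                # the lone free variable is fixed by the parity constraint
                x[undet[0]] = (s + sum(x[v] for v in vs if v in x)) % 2
                progress = True
            else:
                rest.append((vs, s))
        pending = rest
    return x
```

On matrices from a suitable sparse ensemble this peeling succeeds in O(n) operations, which is the point of the complexity claim in the abstract.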

    Message passing algorithms for non-linear nodes and data compression

    The use of parity-check gates in information theory has proved to be very efficient. In particular, error-correcting codes based on parity checks over low-density graphs show excellent performance. Another basic issue of information theory, namely data compression, can be addressed in a similar way by a kind of dual approach. The theoretical performance of such a Parity Source Coder can attain the optimal limit predicted by the general rate-distortion theory. However, in order to turn this approach into an efficient compression code (with fast encoding/decoding algorithms) one must depart from parity checks and use some general random gates. By taking advantage of analytical approaches from the statistical physics of disordered systems and SP-like message passing algorithms, we construct a compressor based on low-density non-linear gates with a very good theoretical and practical performance. Comment: 13 pages, European Conference on Complex Systems, Paris (Nov 2005)

    Shaping the learning landscape in neural networks around wide flat minima

    Learning in Deep Neural Networks (DNN) takes place by minimizing a non-convex high-dimensional loss function, typically by a stochastic gradient descent (SGD) strategy. The learning process is observed to find good minimizers without getting stuck in local critical points, and such minimizers are often satisfactory at avoiding overfitting. How these two features can be kept under control in nonlinear devices composed of millions of tunable connections is a profound and far-reaching open question. In this paper we study basic non-convex one- and two-layer neural network models which learn random patterns, and derive a number of basic geometrical and algorithmic features which suggest some answers. We first show that the error loss function presents few extremely wide flat minima (WFM) which coexist with narrower minima and critical points. We then show that the minimizers of the cross-entropy loss function overlap with the WFM of the error loss. We also show examples of learning devices for which WFM do not exist. From the algorithmic perspective we derive entropy-driven greedy and message passing algorithms which focus their search on wide flat regions of minimizers. In the case of SGD and cross-entropy loss, we show that a slow reduction of the norm of the weights along the learning process also leads to WFM. We corroborate the results by a numerical study of the correlations between the volumes of the minimizers, their Hessian and their generalization performance on real data. Comment: 37 pages (16 main text), 10 figures (7 main text)
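The norm-reduction protocol for SGD can be illustrated on a toy single-unit model: cross-entropy (logistic) SGD on random ±1 patterns with random labels, with the weight norm slowly annealed down after every epoch. Sizes, schedule, and learning rate below are illustrative assumptions, not the paper's experimental settings:

```python
import math
import random

def train_sgd_norm_decay(n=20, p=20, epochs=400, lr=0.1, seed=1):
    """Logistic-loss SGD on p random +-1 patterns in n dimensions,
    with the weight norm slowly shrunk toward 1 after each epoch.
    Returns the final training error rate."""
    rng = random.Random(seed)
    xs = [[rng.choice((-1.0, 1.0)) for _ in range(n)] for _ in range(p)]
    ys = [rng.choice((-1.0, 1.0)) for _ in range(p)]
    w = [rng.gauss(0.0, 1.0 / math.sqrt(n)) for _ in range(n)]
    for epoch in range(epochs):
        for x, y in zip(xs, ys):
            m = y * sum(wi * xi for wi, xi in zip(w, x))   # margin
            g = -y / (1.0 + math.exp(m))   # gradient of log(1 + exp(-m))
            for i in range(n):
                w[i] -= lr * g * x[i]
        # slow norm reduction: the target norm decays geometrically toward 1
        target = 5.0 * (0.995 ** epoch) + 1.0
        norm = math.sqrt(sum(wi * wi for wi in w))
        if norm > target:
            w = [wi * target / norm for wi in w]
    errors = sum(1 for x, y in zip(xs, ys)
                 if y * sum(wi * xi for wi, xi in zip(w, x)) <= 0)
    return errors / p
```

At p = n the random patterns are linearly separable with high probability, so the constrained SGD run is expected to drive the training error to (near) zero while the norm constraint keeps the weights in a low-norm region.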

    A rigorous analysis of the cavity equations for the minimum spanning tree

    We analyze a new general representation for the Minimum Weight Steiner Tree (MST) problem which translates the topological connectivity constraint into a set of local conditions which can be analyzed by the so-called cavity equations techniques. For the limit case of the Spanning Tree we prove that the fixed point of the algorithm arising from the cavity equations leads to the global optimum. Comment: 5 pages, 1 figure
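For the spanning-tree limit case, the global optimum that the cavity fixed point is proven to reach can be computed independently, e.g. by Kruskal's algorithm with union-find. This is a standard combinatorial baseline, not the cavity algorithm itself:

```python
def kruskal_mst(n, edges):
    """Global-optimum spanning tree via Kruskal's algorithm.

    n     : number of nodes, labeled 0..n-1
    edges : iterable of (weight, u, v) triples
    Returns (total weight, list of tree edges)."""
    parent = list(range(n))

    def find(a):
        # union-find with path halving
        while parent[a] != a:
            parent[a] = parent[parent[a]]
            a = parent[a]
        return a

    total, tree = 0.0, []
    for w, u, v in sorted(edges):          # greedily scan edges by weight
        ru, rv = find(u), find(v)
        if ru != rv:                       # keep the edge iff it joins components
            parent[ru] = rv
            tree.append((u, v))
            total += w
    return total, tree
```

On any weighted graph, the total weight returned here is the benchmark against which a cavity-equation fixed point for the spanning-tree case can be checked.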