The Entropy of the K-Satisfiability Problem
The threshold behaviour of the K-Satisfiability problem is studied in the
framework of the statistical mechanics of random diluted systems. We find that
at the transition the entropy is finite and hence that the transition itself is
due to the abrupt appearance of logical contradictions in all solutions and not
to the progressive decreasing of the number of these solutions down to zero. A
physical interpretation is given for the different cases. Comment: revtex, 11 pages + 1 figure
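As a toy illustration of the quantity this abstract studies, the sketch below draws a random K-SAT instance and brute-force counts its satisfying assignments; the entropy per variable is then log(#solutions)/N. This is our own hypothetical illustration (the names `random_ksat` and `count_solutions` are ours), not the paper's statistical-mechanics calculation, and it only runs at tiny sizes.

```python
import itertools
import math
import random

def random_ksat(n, m, k, rng):
    """Draw m random k-clauses over n boolean variables.
    A clause is a tuple of (variable_index, negated) pairs."""
    return [tuple((v, rng.random() < 0.5)
                  for v in rng.sample(range(n), k))
            for _ in range(m)]

def count_solutions(n, clauses):
    """Brute-force count of satisfying assignments (exponential in n,
    so only usable for tiny instances)."""
    count = 0
    for bits in itertools.product((False, True), repeat=n):
        # a literal (v, neg) is true iff bits[v] != neg
        if all(any(bits[v] != neg for v, neg in cl) for cl in clauses):
            count += 1
    return count

rng = random.Random(0)
n, k, alpha = 12, 3, 3.0              # alpha = M/N, the clause density
clauses = random_ksat(n, int(alpha * n), k, rng)
n_sol = count_solutions(n, clauses)
if n_sol:
    print(f"{n_sol} solutions, entropy per variable = {math.log(n_sol) / n:.3f}")
```

A finite entropy at the threshold, in this picture, means the solution count stays exponentially large right up to the point where contradictions appear.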
Sign problem in the Bethe approximation
We propose a message-passing algorithm to compute the Hamiltonian expectation
with respect to an appropriate class of trial wave functions for an interacting
system of fermions. To this end, we connect the quantum expectations to average
quantities in a classical system with both local and global interactions, which
are related to the variational parameters and use the Bethe approximation to
estimate the average energy within the replica-symmetric approximation. The
global interactions, which are needed to obtain a good estimation of the
average fermion sign, make the average energy a nonlocal function of the
variational parameters. We use some heuristic minimization algorithms to find
approximate ground states of the Hubbard model on random regular graphs and
observe significant qualitative improvements with respect to the mean-field
approximation. Comment: 19 pages, 9 figures, one figure added
Efficiency of quantum versus classical annealing in non-convex learning problems
Quantum annealers aim at solving non-convex optimization problems by
exploiting cooperative tunneling effects to escape local minima. The underlying
idea consists in designing a classical energy function whose ground states are
the sought optimal solutions of the original optimization problem and adding a
controllable quantum transverse field to generate tunneling processes. A key
challenge is to identify classes of non-convex optimization problems for which
quantum annealing remains efficient while thermal annealing fails. We show that
this happens for a wide class of problems which are central to machine
learning. Their energy landscape is dominated by local minima that cause an
exponential slowdown of classical thermal annealers, while simulated quantum
annealing converges efficiently to rare dense regions of optimal solutions. Comment: 31 pages, 10 figures
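For contrast with the quantum protocol, here is a minimal sketch of the classical baseline the abstract refers to: Metropolis simulated annealing with geometric cooling, applied to a toy spin landscape. The energy function and all names are our own illustration, not the paper's learning benchmarks.

```python
import math
import random

def simulated_annealing(energy, neighbor, x0, t0=2.0, t_min=1e-3,
                        cooling=0.99, rng=None):
    """Metropolis simulated annealing with a geometric cooling schedule."""
    rng = rng or random.Random(0)
    x, e, t = x0, energy(x0), t0
    best_x, best_e = x, e
    while t > t_min:
        y = neighbor(x, rng)
        de = energy(y) - e
        # accept downhill moves always, uphill moves with Boltzmann weight
        if de <= 0 or rng.random() < math.exp(-de / t):
            x, e = y, e + de
            if e < best_e:
                best_x, best_e = x, e
        t *= cooling
    return best_x, best_e

# Toy landscape: energy counts disagreements with a hidden spin pattern;
# a move flips a single spin.
target = [1, -1, 1, 1, -1, 1, -1, -1]
energy = lambda s: sum(a != b for a, b in zip(s, target))
def neighbor(s, rng):
    t = list(s)
    i = rng.randrange(len(t))
    t[i] = -t[i]
    return t

best, e_best = simulated_annealing(energy, neighbor, [-1] * 8)
print(e_best)
```

On the hard instances studied in the paper, this kind of thermal dynamics gets trapped in the exponentially many narrow minima, which is precisely where simulated quantum annealing is claimed to retain an advantage.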
Learning and generalization theories of large committee--machines
The study of the distribution of volumes associated to the internal
representations of learning examples allows us to derive the critical learning
capacity of large committee machines,
to verify the stability of the solution in the limit of a large number of
hidden units, and to find a Bayesian generalization cross-over. Comment: 14 pages, revtex
Weight Space Structure and Internal Representations: a Direct Approach to Learning and Generalization in Multilayer Neural Networks
We analytically derive the geometrical structure of the weight space in
multilayer neural networks (MLN), in terms of the volumes of couplings
associated to the internal representations of the training set. Focusing on the
parity and committee machines, we deduce their learning and generalization
capabilities both reinterpreting some known properties and finding new exact
results. The relationship between our approach and information theory as well
as the Mitchison--Durbin calculation is established. Our results are exact in
the limit of a large number of hidden units, showing that MLN are a class of
exactly solvable models with a simple interpretation of replica symmetry
breaking. Comment: 12 pages, 1 compressed ps figure (uufile), RevTeX file
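The two tree architectures named in this and the previous abstract can be written down in a few lines. Below is a hedged sketch of committee and parity machines with non-overlapping receptive fields; the function names and the random test point are our own, not code from the papers.

```python
import numpy as np

def hidden_units(W, x):
    """Signs of the K hidden perceptrons of a tree machine: each hidden
    unit sees its own disjoint slice of the input."""
    K, n_per = W.shape
    return np.sign(np.einsum('ki,ki->k', W, x.reshape(K, n_per)))

def committee_machine(W, x):
    """Output = majority vote (sign of the sum) of the hidden units."""
    return int(np.sign(hidden_units(W, x).sum()))

def parity_machine(W, x):
    """Output = product of the hidden units' signs."""
    return int(np.prod(hidden_units(W, x)))

rng = np.random.default_rng(0)
K, n_per = 3, 5                      # K hidden units, tree architecture
W = rng.standard_normal((K, n_per))
x = rng.standard_normal(K * n_per)
print(committee_machine(W, x), parity_machine(W, x))
```

The "internal representation" studied in these papers is exactly the vector returned by `hidden_units`; the weight-space volumes are computed at fixed internal representations of the training set.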
On the performance of a cavity method based algorithm for the Prize-Collecting Steiner Tree Problem on graphs
We study the behavior of an algorithm derived from the cavity method for the
Prize-Collecting Steiner Tree (PCST) problem on graphs. The algorithm is based
on the zero temperature limit of the cavity equations and as such is formally
simple (a fixed point equation resolved by iteration) and distributed
(parallelizable). We provide a detailed comparison with state-of-the-art
algorithms on a wide range of existing benchmark networks and random graphs.
Specifically, we consider an enhanced derivative of the Goemans-Williamson
heuristic and the DHEA solver, a Branch-and-Cut Linear/Integer Programming
based approach. The comparison shows that the cavity algorithm outperforms the
two algorithms in most large instances both in running time and quality of the
solution. Finally we prove a few optimality properties of the solutions
provided by our algorithm, including optimality under the two post-processing
procedures defined in the Goemans-Williamson derivative and global optimality
in some limit cases.
Efficient LDPC Codes over GF(q) for Lossy Data Compression
In this paper we consider the lossy compression of a binary symmetric source.
We present a scheme that provides a low complexity lossy compressor with near
optimal empirical performance. The proposed scheme is based on b-reduced
ultra-sparse LDPC codes over GF(q). Encoding is performed by the Reinforced
Belief Propagation algorithm, a variant of Belief Propagation. The
computational complexity at the encoder is O(d·n·q·log q), where d is the
average degree of the check nodes. For our code ensemble, decoding can be
performed iteratively following the inverse steps of the leaf removal
algorithm. For a sparse parity-check matrix the number of needed operations is
O(n). Comment: 5 pages, 3 figures
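To make the "inverse steps of the leaf removal algorithm" concrete, here is a minimal peeling sketch over a sparse parity-check system (binary for simplicity, whereas the paper works over GF(q)). The representation and names are our own: a check is a set of variable indices, and a "leaf" is a variable appearing in exactly one active check.

```python
from collections import defaultdict

def leaf_removal(checks):
    """Peel a sparse parity-check system: repeatedly take a degree-1
    variable (a leaf) and remove it together with its unique check.
    Returns the peeling order of (variable, check) pairs and the list
    of checks left in the residual core.  An empty residual means the
    system can be solved by running the peeling order backwards
    (back-substitution), i.e. the decoding step the abstract refers to."""
    checks = [set(c) for c in checks]
    deg = defaultdict(int)            # degree of each variable
    for c in checks:
        for v in c:
            deg[v] += 1
    order, active = [], list(range(len(checks)))
    changed = True
    while changed:
        changed = False
        for ci in list(active):
            leaf = next((v for v in checks[ci] if deg[v] == 1), None)
            if leaf is not None:
                order.append((leaf, ci))   # check ci determines variable leaf
                for v in checks[ci]:
                    deg[v] -= 1
                active.remove(ci)
                changed = True
    return order, active

# A tree-like system peels completely; a loop leaves a residual core.
order, residual = leaf_removal([{0, 1}, {1, 2}, {2, 3}])
print(order, residual)
```

The "b-reduced ultra-sparse" ensembles in the paper are designed so that this peeling succeeds with high probability, which is what makes the O(n) decoding possible.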
Message passing algorithms for non-linear nodes and data compression
The use of parity-check gates in information theory has proved to be very
efficient. In particular, error correcting codes based on parity checks over
low-density graphs show excellent performances. Another basic issue of
information theory, namely data compression, can be addressed in a similar way
by a kind of dual approach. The theoretical performance of such a Parity Source
Coder can attain the optimal limit predicted by the general rate-distortion
theory. However, in order to turn this approach into an efficient compression
code (with fast encoding/decoding algorithms) one must depart from parity
checks and use some general random gates. By taking advantage of analytical
approaches from the statistical physics of disordered systems and SP-like
message passing algorithms, we construct a compressor based on low-density
non-linear gates with a very good theoretical and practical performance. Comment: 13 pages, European Conference on Complex Systems, Paris (Nov 2005)
Shaping the learning landscape in neural networks around wide flat minima
Learning in Deep Neural Networks (DNN) takes place by minimizing a non-convex
high-dimensional loss function, typically by a stochastic gradient descent
(SGD) strategy. The learning process is observed to find good
minimizers without getting stuck in local critical points, and such
minimizers are often satisfactory at avoiding overfitting. How these two
features can be kept under control in nonlinear devices composed of millions of
tunable connections is a profound and far reaching open question. In this paper
we study basic non-convex one- and two-layer neural network models which learn
random patterns, and derive a number of basic geometrical and algorithmic
features which suggest some answers. We first show that the error loss function
presents few extremely wide flat minima (WFM) which coexist with narrower
minima and critical points. We then show that the minimizers of the
cross-entropy loss function overlap with the WFM of the error loss. We also
show examples of learning devices for which WFM do not exist. From the
algorithmic perspective we derive entropy driven greedy and message passing
algorithms which focus their search on wide flat regions of minimizers. In the
case of SGD and cross-entropy loss, we show that a slow reduction of the norm
of the weights along the learning process also leads to WFM. We corroborate the
results by a numerical study of the correlations between the volumes of the
minimizers, their Hessians, and their generalization performance on real data. Comment: 37 pages (16 main text), 10 figures (7 main text)
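A crude way to probe the "wideness" described in this abstract is to measure how much the loss rises under random weight perturbations of fixed radius: at a wide flat minimum the loss barely moves. This perturbative proxy is our own simplification (the paper uses local-entropy-style volume calculations), and all names are illustrative.

```python
import numpy as np

def perturbation_flatness(loss, w, radius, n_samples=200, rng=None):
    """Average loss increase under random perturbations of norm `radius`
    around the weights w.  Small values indicate a wide, flat minimum."""
    rng = rng or np.random.default_rng(0)
    base = loss(w)
    deltas = []
    for _ in range(n_samples):
        eps = rng.standard_normal(w.shape)
        eps *= radius / np.linalg.norm(eps)   # fix the perturbation radius
        deltas.append(loss(w + eps) - base)
    return float(np.mean(deltas))

# Two toy minima with the same loss value but different curvature:
# the flatness proxy separates the wide valley from the narrow one.
narrow = lambda w: 10.0 * np.sum(w ** 2)
wide = lambda w: 0.1 * np.sum(w ** 2)
w0 = np.zeros(50)
print(perturbation_flatness(narrow, w0, radius=1.0),
      perturbation_flatness(wide, w0, radius=1.0))
```

The entropy-driven algorithms mentioned in the abstract can be read as biasing the search toward weights for which a measure of this kind stays small over a large radius.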
A rigorous analysis of the cavity equations for the minimum spanning tree
We analyze a new general representation for the Minimum Weight Steiner Tree
(MST) problem which translates the topological connectivity constraint into a
set of local conditions which can be analyzed by the so called cavity equations
techniques. For the limit case of the Spanning tree we prove that the fixed
point of the algorithm arising from the cavity equations leads to the global
optimum. Comment: 5 pages, 1 figure