Characterizing and Improving Generalized Belief Propagation Algorithms on the 2D Edwards-Anderson Model
We study the performance of different message passing algorithms in the
two-dimensional Edwards-Anderson model. We show that the standard Belief
Propagation (BP) algorithm converges only at high temperature to a paramagnetic
solution. Then, we test a Generalized Belief Propagation (GBP) algorithm,
derived from a Cluster Variational Method (CVM) at the plaquette level. We
compare its performance with BP and with other algorithms derived under the
same approximation: Double Loop (DL) and a two-ways message passing algorithm
(HAK). The plaquette-CVM approximation improves BP in at least three ways: the
quality of the paramagnetic solution at high temperatures, a better estimate
(lower) for the critical temperature, and the fact that the GBP message passing
algorithm converges also to non paramagnetic solutions. The lack of convergence
of the standard GBP message passing algorithm at low temperatures seems to be
related to the implementation details and not to the appearance of long range
order. In fact, we prove that a gauge invariance of the constrained CVM free
energy can be exploited to derive a new message passing algorithm which
converges at even lower temperatures. Throughout its region of convergence this
new algorithm is faster than HAK and DL by some orders of magnitude.
Comment: 19 pages, 13 figures
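The claim that standard BP reaches a paramagnetic solution at high temperature can be illustrated with a minimal sketch. It is not the authors' implementation: the lattice size, the ±1 coupling distribution, the parallel update schedule, and the names `make_couplings`, `run_bp`, and `magnetizations` are all assumptions chosen for illustration. Messages are stored as cavity biases (already multiplied by β), with no external field.

```python
import math
import random

def neighbors(site, L):
    """Four nearest neighbours of a site on an L x L periodic square lattice."""
    x, y = site
    return [((x + 1) % L, y), ((x - 1) % L, y), (x, (y + 1) % L), (x, (y - 1) % L)]

def make_couplings(L, rng):
    """Random +/-1 couplings, stored for both directions of every bond (L >= 3)."""
    J = {}
    for x in range(L):
        for y in range(L):
            i = (x, y)
            for j in (((x + 1) % L, y), (x, (y + 1) % L)):
                J[(i, j)] = J[(j, i)] = rng.choice((-1.0, 1.0))
    return J

def run_bp(L, beta, J, rng, max_iters=2000, tol=1e-10):
    """Parallel BP updates; u[(i, j)] is the bias (times beta) that i sends to j."""
    u = {edge: rng.uniform(-0.1, 0.1) for edge in J}
    for it in range(max_iters):
        new_u = {}
        for (i, j) in u:
            # Cavity field on i: sum of incoming biases, excluding the one from j.
            h = sum(u[(k, i)] for k in neighbors(i, L) if k != j)
            new_u[(i, j)] = math.atanh(math.tanh(beta * J[(i, j)]) * math.tanh(h))
        diff = max(abs(new_u[e] - u[e]) for e in u)
        u = new_u
        if diff < tol:
            return u, it + 1, True
    return u, max_iters, False

def magnetizations(u, L):
    """Local magnetizations m_i = tanh(sum of all incoming biases)."""
    sites = {i for (i, _) in u}
    return {i: math.tanh(sum(u[(k, i)] for k in neighbors(i, L))) for i in sites}

if __name__ == "__main__":
    rng = random.Random(1)
    J = make_couplings(6, rng)
    u, iters, converged = run_bp(6, beta=0.3, J=J, rng=rng)
    m = magnetizations(u, 6)
    print(converged, iters, max(abs(v) for v in m.values()))
```

At β = 0.3 each update shrinks the largest message by at least a factor 3 tanh(β) < 1, so all biases decay to zero and every local magnetization vanishes, which is the paramagnetic fixed point. At larger β this bound no longer holds and the same loop may fail to converge, which is the regime the abstract addresses with GBP.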
Message passing and Monte Carlo algorithms: connecting fixed points with metastable states
Mean field-like approximations (including naive mean field, Bethe and Kikuchi
and more general Cluster Variational Methods) are known to stabilize ordered
phases at temperatures higher than the thermodynamical transition. For example,
in the Edwards-Anderson model in two dimensions these approximations predict a
spin glass transition at finite temperature. Here we show that the spin glass solutions
of the Cluster Variational Method (CVM) at plaquette level do describe well
actual metastable states of the system. Moreover, we prove that these states
can be used to predict non trivial statistical quantities, like the
distribution of the overlap between two replicas. Our results support the idea
that message passing algorithms can be helpful to accelerate Monte Carlo
simulations in finite dimensional systems.
Comment: 6 pages, 6 figures
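The overlap between two replicas mentioned above can be made concrete with a small Monte Carlo sketch. It is an assumption-laden illustration, not the procedure used in the paper: it runs plain single-spin-flip Metropolis dynamics on two independent replicas that share the same ±1 couplings, and the names `metropolis_sweep`, `overlap`, and `sample_overlaps` are hypothetical.

```python
import math
import random

def neighbors(site, L):
    """Four nearest neighbours of a site on an L x L periodic square lattice."""
    x, y = site
    return [((x + 1) % L, y), ((x - 1) % L, y), (x, (y + 1) % L), (x, (y - 1) % L)]

def make_couplings(L, rng):
    """Random +/-1 couplings, stored for both directions of every bond (L >= 3)."""
    J = {}
    for x in range(L):
        for y in range(L):
            i = (x, y)
            for j in (((x + 1) % L, y), (x, (y + 1) % L)):
                J[(i, j)] = J[(j, i)] = rng.choice((-1.0, 1.0))
    return J

def metropolis_sweep(spins, J, L, beta, rng):
    """One sweep: as many single-spin-flip attempts as there are spins."""
    sites = list(spins)
    for _ in range(len(sites)):
        i = rng.choice(sites)
        local_field = sum(J[(i, k)] * spins[k] for k in neighbors(i, L))
        delta_e = 2.0 * spins[i] * local_field  # energy change if spin i flips
        if delta_e <= 0 or rng.random() < math.exp(-beta * delta_e):
            spins[i] = -spins[i]

def overlap(a, b):
    """Overlap q = (1/N) * sum_i a_i * b_i between two spin configurations."""
    return sum(a[i] * b[i] for i in a) / len(a)

def sample_overlaps(L, beta, n_sweeps, seed=0):
    """Evolve two replicas with identical couplings and record q after each sweep."""
    rng = random.Random(seed)
    J = make_couplings(L, rng)
    sites = [(x, y) for x in range(L) for y in range(L)]
    replica_a = {s: rng.choice((-1, 1)) for s in sites}
    replica_b = {s: rng.choice((-1, 1)) for s in sites}
    qs = []
    for _ in range(n_sweeps):
        metropolis_sweep(replica_a, J, L, beta, rng)
        metropolis_sweep(replica_b, J, L, beta, rng)
        qs.append(overlap(replica_a, replica_b))
    return qs

if __name__ == "__main__":
    qs = sample_overlaps(L=4, beta=1.0, n_sweeps=200)
    print(min(qs), max(qs))
```

A histogram of the recorded values, averaged over many coupling samples and after discarding an equilibration period, would approximate the overlap distribution; the sketch omits both steps for brevity.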
Region graph partition function expansion and approximate free energy landscapes: Theory and some numerical results
Graphical models for finite-dimensional spin glasses and real-world
combinatorial optimization and satisfaction problems usually have an abundant
number of short loops. The cluster variation method and its extension, the
region graph method, are theoretical approaches for treating the complicated
short-loop-induced local correlations. For graphical models represented by
non-redundant or redundant region graphs, approximate free energy landscapes
are constructed in this paper through the mathematical framework of region
graph partition function expansion. Several free energy functionals are
obtained, each of which uses a set of probability distribution functions or
functionals as order parameters. These probability distribution
function/functionals are required to satisfy the region graph
belief-propagation equation or the region graph survey-propagation equation to
ensure vanishing correction contributions of region subgraphs with dangling
edges. As a simple application of the general theory, we perform region graph
belief-propagation simulations on the square-lattice ferromagnetic Ising model
and the Edwards-Anderson model. Considerable improvements over the conventional
Bethe-Peierls approximation are achieved. Collective domains of different sizes
in the disordered and frustrated square lattice are identified by the
message-passing procedure. Such collective domains and the frustrations among
them are responsible for the low-temperature glass-like dynamical behaviors of
the system.
Comment: 30 pages, 11 figures. More discussion on redundant region graphs. To
be published by Journal of Statistical Physics
Cycle-based Cluster Variational Method for Direct and Inverse Inference
We elaborate on the idea that loop corrections to belief propagation could be
dealt with in a systematic way on pairwise Markov random fields, by using the
elements of a cycle basis to define regions in a generalized belief propagation
setting. The region graph is specified in such a way as to avoid dual loops as
much as possible, by discarding redundant Lagrange multipliers, in order to
facilitate the convergence, while avoiding instabilities associated with minimal
factor graph construction. We end up with a two-level algorithm, where a belief
propagation algorithm is run alternately at the level of each cycle and at
the inter-region level. The inverse problem of finding the couplings of a
Markov random field from empirical covariances can be addressed region-wise. It
turns out that this can be done efficiently in particular in the Ising context,
where fixed point equations can be derived along with a one-parameter log
likelihood function to minimize. Numerical experiments confirm the
effectiveness of these considerations both for the direct and inverse MRF
inference.
Comment: 47 pages, 16 figures
Belief Propagation and Loop Series on Planar Graphs
We discuss a generic model of Bayesian inference with binary variables
defined on edges of a planar graph. The Loop Calculus approach of [1, 2] is
used to evaluate the resulting series expansion for the partition function. We
show that, for planar graphs, truncating the series at single-connected loops
reduces, via a map reminiscent of the Fisher transformation [3], to evaluating
the partition function of the dimer matching model on an auxiliary planar
graph. Thus, the truncated series can be easily re-summed, using the Pfaffian
formula of Kasteleyn [4]. This allows one to identify a large class of
computationally tractable planar models reducible to a dimer model via the
Belief Propagation (gauge) transformation. The Pfaffian representation can also
be extended to the full Loop Series, in which case the expansion becomes a sum
of Pfaffian contributions, each associated with dimer matchings on an extension
to a subgraph of the original graph. Algorithmic consequences of the Pfaffian
representation, as well as relations to quantum and non-planar models, are
discussed.
Comment: Accepted for publication in Journal of Statistical Mechanics: Theory
and Experiment
Statistical physics of constraint satisfaction problems
The replica trick is a powerful analytic technique originating from statistical physics as an attempt to compute the expectation of the logarithm of the normalization constant of a high-dimensional probability distribution known as the Gibbs measure. In physics jargon this quantity is known as the free energy, and all kinds of useful quantities, such as the entropy, can be obtained from it using simple derivatives. Computing this normalization constant is, however, an NP-hard problem that a large part of computational statistics attempts to deal with, and which shows up everywhere from coding theory to high-dimensional statistics, compressed sensing, protein folding analysis and constraint satisfaction problems. In each of these cases, the replica trick, and its extension by (Parisi et al., 1987), have proven incredibly successful at shedding light on key aspects of the correlation structure of the Gibbs measure and the highly non-convex nature of its negative logarithm. Algorithmically speaking, there exist two main methodologies addressing the intractability of the normalization constant:
a) Statics: in this approach, one casts the system as a graphical model whose vertices represent individual variables, and whose edges reflect the dependencies between them. When the underlying graph is locally tree-like, local message-passing procedures are guaranteed to yield near-exact marginal probabilities or, equivalently, to compute the normalization constant Z. The physics predictions of vanishing long-range correlations in the Gibbs measure then translate into the associated graph being locally tree-like, hence permitting the use of message-passing procedures. This will be the focus of chapter 4.
b) Dynamics: in an orthogonal direction, we can bypass the issue of computing the normalization constant altogether by defining a Markov chain along which sampling converges to the Gibbs measure, such that after a number of iterations known as the relaxation time, samples are guaranteed to be approximately distributed according to the Gibbs measure.
To understand the conditions under which each of the two approaches is likely to fail (strong long-range correlations, high energy barriers, etc.), it is very helpful to be familiar with the so-called replica symmetry breaking picture of Parisi. The computations required are, however, quite involved, and come with a number of prescriptions and prerequisite notions (such as large deviation principles and saddle-point approximations) that are typically foreign to those without a statistical physics background. The purpose of this thesis is then twofold: i) to provide a self-contained introduction to replica theory, its predictions, and its algorithmic implications for constraint satisfaction problems, and ii) to give an account of state-of-the-art methods addressing the predicted phase transitions in the case of k-SAT, from both the statics and dynamics points of view, and to propose a new algorithm that takes these into consideration.
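The statement that quantities such as the entropy follow from the free energy by differentiation can be checked on a toy system small enough to enumerate. The sketch below is only an illustration under stated assumptions: a hypothetical four-spin Ising ring with one frustrated bond, units with k_B = 1, and helper names (`log_partition`, `free_energy`, `entropy_from_derivative`, `gibbs_entropy`) invented for this example. The brute-force sum over 2^n configurations is exactly the cost that becomes intractable for large systems.

```python
import itertools
import math

def energy(spins, couplings):
    """Ising energy H = -sum_{(i,j)} J_ij * s_i * s_j."""
    return -sum(J * spins[i] * spins[j] for (i, j), J in couplings.items())

def log_partition(beta, couplings, n):
    """log Z by enumerating all 2**n configurations (log-sum-exp for stability)."""
    log_terms = [-beta * energy(s, couplings)
                 for s in itertools.product((-1, 1), repeat=n)]
    top = max(log_terms)
    return top + math.log(sum(math.exp(t - top) for t in log_terms))

def free_energy(T, couplings, n):
    """F(T) = -T log Z, with k_B = 1."""
    return -T * log_partition(1.0 / T, couplings, n)

def entropy_from_derivative(T, couplings, n, dT=1e-5):
    """S = -dF/dT, estimated with a central finite difference."""
    f_plus = free_energy(T + dT, couplings, n)
    f_minus = free_energy(T - dT, couplings, n)
    return -(f_plus - f_minus) / (2.0 * dT)

def gibbs_entropy(T, couplings, n):
    """S = -sum_s p(s) log p(s), computed directly from the Gibbs measure."""
    beta = 1.0 / T
    log_z = log_partition(beta, couplings, n)
    total = 0.0
    for s in itertools.product((-1, 1), repeat=n):
        log_p = -beta * energy(s, couplings) - log_z
        total -= math.exp(log_p) * log_p
    return total

if __name__ == "__main__":
    ring = {(0, 1): 1.0, (1, 2): 1.0, (2, 3): 1.0, (3, 0): -1.0}
    print(entropy_from_derivative(1.5, ring, 4), gibbs_entropy(1.5, ring, 4))
```

The two entropy estimates agree up to finite-difference error, and both approach n log 2 at high temperature, where all configurations become equally likely.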
Spherical and Hyperbolic Toric Topology-Based Codes On Graph Embedding for Ising MRF Models: Classical and Quantum Topology Machine Learning
The paper introduces the application of information geometry to describe the
ground states of Ising models by utilizing parity-check matrices of cyclic and
quasi-cyclic codes on toric and spherical topologies. The approach establishes
a connection between machine learning and error-correcting coding. This
proposed approach has implications for the development of new embedding methods
based on trapping sets. Statistical physics and number geometry are applied to
optimize error-correcting codes, leading to these embedding and sparse
factorization methods. The paper establishes a direct connection between DNN
architecture and error-correcting coding by demonstrating how state-of-the-art
architectures (ChordMixer, Mega, Mega-chunk, CDIL, ...) from the long-range
arena can be equivalent to block and convolutional LDPC codes (Cage-graph,
Repeat Accumulate). QC codes correspond to certain types of chemical elements,
with the carbon element being represented by the mixed automorphism
Shu-Lin-Fossorier QC-LDPC code. The connections between Belief Propagation and
the Permanent, Bethe-Permanent, Nishimori Temperature, and Bethe-Hessian Matrix
are elaborated upon in detail. The Quantum Approximate Optimization Algorithm
(QAOA) used in the Sherrington-Kirkpatrick Ising model can be seen as analogous
to the back-propagation loss function landscape in training DNNs. This
similarity creates a comparable problem with TS pseudo-codeword, resembling the
belief propagation method. Additionally, the layer depth in QAOA correlates to
the number of decoding belief propagation iterations in the Wiberg decoding
tree. Overall, this work has the potential to advance multiple fields, from
Information Theory, DNN architecture design (sparse and structured prior graph
topology), efficient hardware design for Quantum and Classical DPU/TPU (graph,
quantize and shift register architect.) to Materials Science and beyond.
Comment: 71 pages, 42 Figures, 1 Table, 1 Appendix. arXiv admin note: text
overlap with arXiv:2109.08184 by other authors