246 research outputs found

Dynamics of heuristic optimization algorithms on random graphs

Author: Weigt Martin
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 24/06/2002

In this paper, the dynamics of heuristic algorithms for constructing small vertex covers (or independent sets) of finite-connectivity random graphs is analysed. In every algorithmic step, a vertex is chosen with respect to its vertex degree. This vertex, and some environment of it, is covered and removed from the graph. This graph reduction process can be described as a Markovian dynamics in the space of random graphs of arbitrary degree distribution. We discuss some solvable cases, including algorithms already analysed using different techniques, and develop approximation schemes for more complicated cases. The approximations are corroborated by numerical simulations. Comment: 19 pages, 3 figures, version to app. in EPJ
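To make the graph-reduction process in this abstract concrete, here is a minimal sketch in Python. It assumes one particular degree-based rule (always cover a vertex of maximum remaining degree) and an Erdős-Rényi-style input graph; the rule, the function names, and the parameters `n`, `c`, and `seed` are illustrative assumptions, not the specific algorithms analysed in the paper.

```python
import random

def random_graph(n, c, seed=0):
    """Build a random graph on n vertices with mean degree about c (adjacency sets)."""
    rng = random.Random(seed)
    p = c / (n - 1)  # edge probability giving average degree c
    adj = {v: set() for v in range(n)}
    for u in range(n):
        for v in range(u + 1, n):
            if rng.random() < p:
                adj[u].add(v)
                adj[v].add(u)
    return adj

def greedy_vertex_cover(adj):
    """Assumed heuristic: repeatedly cover a maximum-degree vertex and remove it."""
    adj = {v: set(nb) for v, nb in adj.items()}  # work on a copy
    cover = set()
    while True:
        v = max(adj, key=lambda u: len(adj[u]), default=None)
        if v is None or not adj[v]:
            break  # no edges left, so every edge is covered
        cover.add(v)
        for w in adj[v]:
            adj[w].discard(v)  # remove the edges incident to v
        del adj[v]
    return cover

def is_vertex_cover(adj, cover):
    """Check that every edge has at least one endpoint in the cover."""
    return all(u in cover or v in cover for u in adj for v in adj[u])
```

Each pass through the loop is one step of the reduction: the state after the step is again a graph, which is what allows the process to be viewed as a dynamics on graphs.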

arXiv.org e-Print Archive

Crossref

EDP Sciences OAI-PMH repository (1.2.0)

Selection of sequence motifs and generative Hopfield-Potts models for protein families

Author: Shimagaki Kai
Weigt Martin
Publication venue: 'American Physical Society (APS)'
Publication date: 01/01/2019

Statistical models for families of evolutionarily related proteins have recently gained interest: in particular, pairwise Potts models, such as those inferred by Direct-Coupling Analysis, have been able to extract information about the three-dimensional structure of folded proteins and about the effect of amino-acid substitutions in proteins. These models are typically required to reproduce the one- and two-point statistics of the amino-acid usage in a protein family, i.e. to capture the so-called residue conservation and covariation statistics of proteins of common evolutionary origin. Pairwise Potts models are the maximum-entropy models achieving this. While successful, these models depend on huge numbers of ad hoc parameters, which have to be estimated from a finite amount of data and whose biophysical interpretation remains unclear. Here we propose an approach to parameter reduction, which is based on selecting collective sequence motifs. It naturally leads to the formulation of statistical sequence models in terms of Hopfield-Potts models. These models can be accurately inferred using a mapping to restricted Boltzmann machines and persistent contrastive divergence. We show that, when applied to protein data, even 20-40 patterns are sufficient to obtain statistically close-to-generative models. The Hopfield patterns form interpretable sequence motifs and may be used to cluster amino-acid sequences into functional sub-families. However, the distributed collective nature of these motifs intrinsically limits the ability of Hopfield-Potts models to predict contact maps, showing the necessity of developing models going beyond the Hopfield-Potts models discussed here. Comment: 26 pages, 16 figures, to app. in PR

arXiv.org e-Print Archive

HAL Descartes

HAL-INSU

Hal-Diderot

Typical solution time for a vertex-covering algorithm on finite-connectivity random graphs

Author: Hartmann Alexander K.
Weigt Martin
Publication venue: 'American Physical Society (APS)'
Publication date: 28/11/2000

In this letter, we analytically describe the typical solution time needed by a backtracking algorithm to solve the vertex-cover problem on finite-connectivity random graphs. We find two different transitions: The first one is algorithm-dependent and marks the dynamical transition from linear to exponential solution times. The second one gives the maximum computational complexity, and is found exactly at the threshold where the system undergoes an algorithm-independent phase transition in its solvability. Analytical results are corroborated by numerical simulations. Comment: 4 pages, 2 figures, to appear in Phys. Rev. Let
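As a rough illustration of what "solution time of a backtracking algorithm" can mean here, the sketch below decides whether a graph has a vertex cover of at most `k` vertices by branching on the two endpoints of an uncovered edge, and counts the visited search-tree nodes as a proxy for running time. This simple edge-branching scheme, the function name `has_cover`, and the `stats` counter are assumptions made for illustration; the abstract does not specify the algorithm in this form.

```python
def has_cover(edges, k, stats):
    """Return True if the edges can be covered with at most k vertices.

    edges: list of (u, v) tuples; stats: dict with a "nodes" counter that
    records how many search-tree nodes were visited (a proxy for run time).
    """
    stats["nodes"] += 1
    if not edges:
        return True   # nothing left to cover
    if k == 0:
        return False  # edges remain but no vertices may be added
    u, v = edges[0]
    # Any cover must contain u or v, so try both and backtrack on failure.
    for chosen in (u, v):
        remaining = [e for e in edges if chosen not in e]
        if has_cover(remaining, k - 1, stats):
            return True
    return False
```

Calling `has_cover(edges, k, {"nodes": 0})` for a fixed graph and varying `k` shows how the node count depends on the allowed cover size, which is the quantity whose typical growth (linear versus exponential) the abstract discusses.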

arXiv.org e-Print Archive

Eldorado - Ressourcen aus und für Lehre, Studium und Forschung

Threshold values, stability analysis and high-q asymptotics for the coloring problem on random graphs

Author: Krzakala Florent
Pagnani Andrea
Weigt Martin
Publication venue: 'American Physical Society (APS)'
Publication date: 01/01/2004

We consider the problem of coloring Erdős-Rényi and regular random graphs of finite connectivity using $q$ colors. It has been studied so far using the cavity approach within the so-called one-step replica symmetry breaking (1RSB) ansatz. We derive a general criterion for the validity of this ansatz and, applying it to the ground state, we provide evidence that the 1RSB solution gives exact threshold values $c_q$ for the $q$-COL/UNCOL phase transition. We also study the asymptotic thresholds for $q \gg 1$, finding $c_q = 2q\log(q) - \log(q) - 1 + o(1)$ in perfect agreement with rigorous mathematical bounds, as well as the nature of excited states, and give a global phase diagram of the problem. Comment: 23 pages, 10 figures. Replaced with accepted versio
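The large-q formula quoted in this abstract is easy to evaluate numerically. The sketch below computes only the explicit terms and drops the o(1) correction, so it is an approximation that is meaningful for large q. The function name is hypothetical, and reading log as the natural logarithm is an assumption.

```python
import math

def coloring_threshold_asymptotic(q):
    """Leading terms of c_q = 2 q log(q) - log(q) - 1, ignoring the o(1) correction.

    Assumes log is the natural logarithm; intended as a large-q estimate only.
    """
    if q < 2:
        raise ValueError("q must be at least 2")
    return 2 * q * math.log(q) - math.log(q) - 1

# Example: tabulate the estimate for a few numbers of colors.
for q in (3, 4, 5, 10, 100):
    print(q, round(coloring_threshold_asymptotic(q), 3))
```

Because the o(1) term is omitted, values at small q should not be read as the exact thresholds reported in the paper.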

arXiv.org e-Print Archive

PORTO@iris (Publications Open Repository TOrino - Politecnico di Torino)

PORTO Publications Open Repository TOrino

A variational description of the ground state structure in random satisfiability problems

Author: Biroli Giulio
Monasson Remi
Weigt Martin
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 15/11/1999

A variational approach to finite connectivity spin-glass-like models is developed and applied to describe the structure of optimal solutions in random satisfiability problems. Our variational scheme accurately reproduces the known replica symmetric results and also allows for the inclusion of replica symmetry breaking effects. For the 3-SAT problem, we find two transitions as the ratio $\alpha$ of logical clauses per Boolean variable increases. At the first one, $\alpha_s \simeq 3.96$, a non-trivial organization of the solution space in geometrically separated clusters emerges. The multiplicity of these clusters as well as the typical distances between different solutions are calculated. At the second threshold, $\alpha_c \simeq 4.48$, satisfying assignments disappear and a finite fraction $B_0 \simeq 0.13$ of variables are overconstrained and take the same values in all optimal (though unsatisfying) assignments. These values have to be compared to $\alpha_c \simeq 4.27$, $B_0 \simeq 0.4$ obtained from numerical experiments on small instances. Within the present variational approach, the SAT-UNSAT transition naturally appears as a mixture of a first and a second order transition. For the mixed $2+p$-SAT with $p<2/5$, the behavior is as expected much simpler: a unique smooth transition from SAT to UNSAT takes place at $\alpha_c=1/(1-p)$. Comment: 24 pages, 6 eps figures, to be published in Europ. Phys. J.

arXiv.org e-Print Archive

Crossref

EDP Sciences OAI-PMH repository (1.2.0)

From principal component to direct coupling analysis of coevolution in proteins: Low-eigenvalue modes are needed for structure prediction

Author: Cocco Simona
Monasson Remi
Weigt Martin
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 22/08/2013

Various approaches have explored the covariation of residues in multiple-sequence alignments of homologous proteins to extract functional and structural information. Among those are principal component analysis (PCA), which identifies the most correlated groups of residues, and direct coupling analysis (DCA), a global inference method based on the maximum entropy principle, which aims at predicting residue-residue contacts. In this paper, inspired by the statistical physics of disordered systems, we introduce the Hopfield-Potts model to naturally interpolate between these two approaches. The Hopfield-Potts model allows us to identify relevant 'patterns' of residues from the knowledge of the eigenmodes and eigenvalues of the residue-residue correlation matrix. We show how the computation of such statistical patterns makes it possible to accurately predict residue-residue contacts with a much smaller number of parameters than DCA. This dimensional reduction allows us to avoid overfitting and to extract contact information from multiple-sequence alignments of reduced size. In addition, we show that low-eigenvalue correlation modes, discarded by PCA, are important to recover structural information: the corresponding patterns are highly localized, that is, they are concentrated in few sites, which we find to be in close contact in the three-dimensional protein fold. Comment: Supporting information can be downloaded from: http://www.ploscompbiol.org/article/info:doi/10.1371/journal.pcbi.100317

arXiv.org e-Print Archive

Directory of Open Access Journals

PubMed Central

Hal-Diderot

The Francis Crick Institute

Towards finite-dimensional gelation

Author: Broderix Kurt
Weigt Martin
Zippelius Annette
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 14/08/2002

We consider the gelation of particles which are permanently connected by random crosslinks, drawn from an ensemble of finite-dimensional continuum percolation. To average over the randomness, we apply the replica trick, and interpret the replicated and crosslink-averaged model as an effective molecular fluid. A Mayer-cluster expansion for moments of the local static density fluctuations is set up. The simplest non-trivial contribution to this series leads back to mean-field theory. The central quantity of mean-field theory is the distribution of localization lengths, which we compute for all connectivities. The highly crosslinked gel is characterized by a one-to-one correspondence of connectivity and localization length. Taking into account higher contributions in the Mayer-cluster expansion, systematic corrections to mean-field can be included. The sol-gel transition shifts to a higher number of crosslinks per particle, as more compact structures are favored. The critical behavior of the model remains unchanged as long as finite truncations of the cluster expansion are considered. To complete the picture, we also discuss various geometrical properties of the crosslink network, e.g. connectivity correlations, and relate the studied crosslink ensemble to a wider class of ensembles, including the Deam-Edwards distribution. Comment: 18 pages, 4 figures, version to be published in EPJ

arXiv.org e-Print Archive

Crossref

EDP Sciences OAI-PMH repository (1.2.0)