Dynamics of heuristic optimization algorithms on random graphs
In this paper, the dynamics of heuristic algorithms for constructing small
vertex covers (or independent sets) of finite-connectivity random graphs is
analysed. In every algorithmic step, a vertex is chosen with respect to its
vertex degree. This vertex, and some environment of it, is covered and removed
from the graph. This graph reduction process can be described as a Markovian
dynamics in the space of random graphs of arbitrary degree distribution. We
discuss some solvable cases, including algorithms already analysed using
different techniques, and develop approximation schemes for more complicated
cases. The approximations are corroborated by numerical simulations.
Comment: 19 pages, 3 figures, version to app. in EPJ
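The graph reduction step described in this abstract can be illustrated with a minimal sketch. It assumes one particular degree-based rule (always cover a vertex of maximal current degree) and removes only that vertex and its edges; the paper analyses a broader family of rules, so this is an illustration rather than the authors' algorithm. The function names random_graph, greedy_cover and is_cover are hypothetical.

```python
import random


def random_graph(n, c, seed=0):
    """Random graph on n vertices; each pair is linked with probability c/n."""
    rng = random.Random(seed)
    adj = {v: set() for v in range(n)}
    p = c / n
    for u in range(n):
        for v in range(u + 1, n):
            if rng.random() < p:
                adj[u].add(v)
                adj[v].add(u)
    return adj


def greedy_cover(adj):
    """Cover a highest-degree vertex, remove it, and repeat until no edge is left.

    Assumption: 'maximal degree' is just one example of a degree-based choice.
    """
    work = {v: set(nb) for v, nb in adj.items()}  # leave the input untouched
    cover = set()
    while any(work.values()):
        v = max(work, key=lambda u: len(work[u]))
        cover.add(v)
        for w in work[v]:
            work[w].discard(v)  # delete the edges incident to v
        del work[v]             # remove v itself from the reduced graph
    return cover


def is_cover(adj, cover):
    """Check that every edge has at least one endpoint in the cover."""
    return all(u in cover or v in cover for u in adj for v in adj[u])
```

Calling greedy_cover(random_graph(1000, 2.0)) returns a set of covered vertices; its size divided by the number of vertices is the kind of quantity whose typical value the analysis of the reduction dynamics aims to predict.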
Selection of sequence motifs and generative Hopfield-Potts models for protein families
Statistical models for families of evolutionarily related proteins have
recently gained interest: in particular pairwise Potts models, such as those
inferred by the Direct-Coupling Analysis, have been able to extract information
about the three-dimensional structure of folded proteins, and about the effect
of amino-acid substitutions in proteins. These models are typically required
to reproduce the one- and two-point statistics of the amino-acid usage in a
protein family, i.e. to capture the so-called residue conservation and
covariation statistics of proteins of common evolutionary origin. Pairwise
Potts models are the maximum-entropy models achieving this. Although
successful, these models depend on huge numbers of parameters introduced ad hoc,
which have to be estimated from a finite amount of data and whose
biophysical interpretation remains unclear. Here we propose an approach to
parameter reduction, which is based on selecting collective sequence motifs. It
naturally leads to the formulation of statistical sequence models in terms of
Hopfield-Potts models. These models can be accurately inferred using a mapping
to restricted Boltzmann machines and persistent contrastive divergence. We show
that, when applied to protein data, even 20-40 patterns are sufficient to
obtain statistically close-to-generative models. The Hopfield patterns form
interpretable sequence motifs and may be used to cluster amino-acid
sequences into functional sub-families. However, the distributed collective
nature of these motifs intrinsically limits the ability of Hopfield-Potts
models to predict contact maps, showing the necessity of developing models
going beyond the Hopfield-Potts models discussed here.
Comment: 26 pages, 16 figures, to app. in PR
Typical solution time for a vertex-covering algorithm on finite-connectivity random graphs
In this letter, we analytically describe the typical solution time needed by
a backtracking algorithm to solve the vertex-cover problem on
finite-connectivity random graphs. We find two different transitions: The first
one is algorithm-dependent and marks the dynamical transition from linear to
exponential solution times. The second one gives the maximum computational
complexity, and is found exactly at the threshold where the system undergoes an
algorithm-independent phase transition in its solvability. Analytical results
are corroborated by numerical simulations.
Comment: 4 pages, 2 figures, to appear in Phys. Rev. Let
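To make the notion of solution time concrete, here is a minimal sketch of a backtracking search for the decision version of vertex cover, counting visited search-tree nodes as a proxy for running time. The branching rule (pick an uncovered edge and try each endpoint) is a generic textbook choice assumed for illustration and is not necessarily the algorithm analysed in the letter. The name has_cover is hypothetical.

```python
def has_cover(edges, k):
    """Decide whether the graph given by `edges` has a vertex cover of size <= k.

    Returns (answer, visited), where `visited` is the number of search-tree
    nodes explored, used here as a simple measure of solution time.
    """
    visited = 0

    def search(remaining, budget):
        nonlocal visited
        visited += 1
        if not remaining:      # every edge is covered
            return True
        if budget == 0:        # edges remain but no vertex may be added
            return False
        u, v = remaining[0]    # this edge needs u or v in the cover
        for pick in (u, v):
            rest = [e for e in remaining if pick not in e]
            if search(rest, budget - 1):
                return True
        return False           # both branches failed: backtrack

    answer = search(list(edges), k)
    return answer, visited
```

Running has_cover on random graphs of growing size and recording how visited scales for different cover sizes k is one way to observe numerically the change from easy to hard instances.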
Threshold values, stability analysis and high-q asymptotics for the coloring problem on random graphs
We consider the problem of coloring Erdős-Rényi and regular random graphs of
finite connectivity using q colors. It has been studied so far using the cavity
approach within the so-called one-step replica symmetry breaking (1RSB) ansatz.
We derive a general criterion for the validity of this ansatz and, applying it
to the ground state, we provide evidence that the 1RSB solution gives exact
threshold values c_q for the q-COL/UNCOL phase transition. We also study the
asymptotic thresholds for q >> 1, finding c_q = 2q log(q) - log(q) - 1 + o(1), in
perfect agreement with rigorous mathematical bounds, as well as the nature of
excited states, and give a global phase diagram of the problem.
Comment: 23 pages, 10 figures. Replaced with accepted version
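The large-q expression quoted in this abstract is easy to evaluate numerically. The sketch below assumes that log denotes the natural logarithm and simply drops the o(1) correction, so it is only meaningful for large q and should not be read as the exact threshold at small q. The name asymptotic_threshold is hypothetical.

```python
import math


def asymptotic_threshold(q):
    """Leading large-q form of c_q: 2 q log(q) - log(q) - 1.

    Assumptions: natural logarithm; the o(1) term is neglected.
    """
    if q < 2:
        raise ValueError("q must be at least 2")
    return 2 * q * math.log(q) - math.log(q) - 1


if __name__ == "__main__":
    for q in (10, 100, 1000):
        print(q, round(asymptotic_threshold(q), 3))
```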
A variational description of the ground state structure in random satisfiability problems
A variational approach to finite connectivity spin-glass-like models is
developed and applied to describe the structure of optimal solutions in random
satisfiability problems. Our variational scheme accurately reproduces the known
replica symmetric results and also allows for the inclusion of replica symmetry
breaking effects. For the 3-SAT problem, we find two transitions as the ratio
of logical clauses per Boolean variable increases. At the first one, a
non-trivial organization of the solution space in geometrically separated
clusters emerges. The multiplicity of these clusters as well as the typical
distances between different solutions are calculated. At the second threshold,
satisfying assignments disappear and a finite fraction of variables are
overconstrained and take the same values in all optimal (though unsatisfying)
assignments. These values have to be compared to those obtained
from numerical experiments on small instances. Within the present variational
approach, the SAT-UNSAT transition naturally appears as a mixture of a first
and a second order transition. For the mixed -SAT with , the
behavior is as expected much simpler: a unique smooth transition from SAT to
UNSAT takes place at .
Comment: 24 pages, 6 eps figures, to be published in Europ. Phys. J.
From principal component to direct coupling analysis of coevolution in proteins: Low-eigenvalue modes are needed for structure prediction
Various approaches have explored the covariation of residues in
multiple-sequence alignments of homologous proteins to extract functional and
structural information. Among those are principal component analysis (PCA),
which identifies the most correlated groups of residues, and direct coupling
analysis (DCA), a global inference method based on the maximum entropy
principle, which aims at predicting residue-residue contacts. In this paper,
inspired by the statistical physics of disordered systems, we introduce the
Hopfield-Potts model to naturally interpolate between these two approaches. The
Hopfield-Potts model allows us to identify relevant 'patterns' of residues from
the knowledge of the eigenmodes and eigenvalues of the residue-residue
correlation matrix. We show how the computation of such statistical patterns
makes it possible to accurately predict residue-residue contacts with a much
smaller number of parameters than DCA. This dimensional reduction allows us to
avoid overfitting and to extract contact information from multiple-sequence
alignments of reduced size. In addition, we show that low-eigenvalue
correlation modes, discarded by PCA, are important to recover structural
information: the corresponding patterns are highly localized, that is, they are
concentrated in few sites, which we find to be in close contact in the
three-dimensional protein fold.
Comment: Supporting information can be downloaded from:
http://www.ploscompbiol.org/article/info:doi/10.1371/journal.pcbi.100317
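The statement that some patterns are concentrated in few sites can be quantified with a localization measure. The sketch below uses the inverse participation ratio, a standard choice in the physics of disordered systems; the abstract does not say which measure the authors use, so this is an assumption, and the name inverse_participation_ratio is hypothetical.

```python
def inverse_participation_ratio(pattern):
    """Localization of a pattern: sum(x^4) / (sum(x^2))^2.

    The value is 1/L for a pattern spread evenly over L sites and 1 for a
    pattern carried by a single site.
    """
    norm2 = sum(x * x for x in pattern)
    if norm2 == 0:
        raise ValueError("pattern must be non-zero")
    return sum(x ** 4 for x in pattern) / norm2 ** 2
```

For a pattern defined over sites and amino acids, one would first reduce it to one weight per site (for example the norm over amino acids, again an assumption) before applying the function.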
Towards finite-dimensional gelation
We consider the gelation of particles which are permanently connected by
random crosslinks, drawn from an ensemble of finite-dimensional continuum
percolation. To average over the randomness, we apply the replica trick, and
interpret the replicated and crosslink-averaged model as an effective molecular
fluid. A Mayer-cluster expansion for moments of the local static density
fluctuations is set up. The simplest non-trivial contribution to this series
leads back to mean-field theory. The central quantity of mean-field theory is
the distribution of localization lengths, which we compute for all
connectivities. The highly crosslinked gel is characterized by a one-to-one
correspondence of connectivity and localization length. Taking into account
higher contributions in the Mayer-cluster expansion, systematic corrections to
mean-field can be included. The sol-gel transition shifts to a higher number of
crosslinks per particle, as more compact structures are favored. The critical
behavior of the model remains unchanged as long as finite truncations of the
cluster expansion are considered. To complete the picture, we also discuss
various geometrical properties of the crosslink network, e.g. connectivity
correlations, and relate the studied crosslink ensemble to a wider class of
ensembles, including the Deam-Edwards distribution.
Comment: 18 pages, 4 figures, version to be published in EPJ