10 research outputs found

    Biased landscapes for random Constraint Satisfaction Problems

    The typical complexity of Constraint Satisfaction Problems (CSPs) can be investigated by means of random ensembles of instances. The latter exhibit many threshold phenomena besides their satisfiability phase transition, in particular a clustering or dynamic phase transition (related to the tree reconstruction problem) at which their typical solutions shatter into disconnected components. In this paper we study the evolution of this phenomenon under a bias that breaks the uniformity among solutions of one CSP instance, concentrating on the bicoloring of k-uniform random hypergraphs. We show that for small k the clustering transition can be delayed in this way to higher densities of constraints, and that this strategy has a positive impact on the performance of Simulated Annealing algorithms. We characterize the modest gain that can be expected in the large k limit from the simple implementation of the biasing idea studied here. This paper also contains a contribution of a more methodological nature: a review and extension of the methods for numerically determining the discontinuous dynamic transition threshold. Comment: 32 pages, 16 figures
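
As a rough illustration of the setting in the abstract above, the following is a minimal sketch of plain Simulated Annealing for bicoloring a random k-uniform hypergraph. It assumes the uniform (unbiased) energy, namely the number of monochromatic hyperedges, and a linear inverse-temperature schedule; it does not implement the biased measure studied in the paper. All function names and parameters (`sweeps`, `beta_max`) are hypothetical choices made for this example.

```python
import math
import random


def random_hypergraph(n, m, k, rng):
    """Draw m hyperedges, each made of k distinct vertices chosen uniformly."""
    return [tuple(rng.sample(range(n), k)) for _ in range(m)]


def is_monochromatic(edge, spins):
    """A hyperedge is violated when all of its vertices carry the same color."""
    first = spins[edge[0]]
    return all(spins[v] == first for v in edge)


def count_monochromatic(edges, spins):
    """Energy of a coloring: the number of violated (monochromatic) hyperedges."""
    return sum(is_monochromatic(e, spins) for e in edges)


def simulated_annealing(n, edges, rng, sweeps=200, beta_max=5.0):
    """Metropolis single-spin flips with a linearly increasing inverse temperature.

    Assumption: uniform energy and linear schedule, chosen only for illustration.
    """
    spins = [rng.choice((-1, 1)) for _ in range(n)]
    incident = [[] for _ in range(n)]  # hyperedges touching each vertex
    for a, edge in enumerate(edges):
        for v in edge:
            incident[v].append(a)

    def local_energy(v):
        return sum(is_monochromatic(edges[a], spins) for a in incident[v])

    energy = count_monochromatic(edges, spins)
    for sweep in range(sweeps):
        if energy == 0:
            break
        beta = beta_max * (sweep + 1) / sweeps
        for _ in range(n):
            v = rng.randrange(n)
            before = local_energy(v)
            spins[v] = -spins[v]  # propose a flip
            delta = local_energy(v) - before
            if delta > 0 and rng.random() >= math.exp(-beta * delta):
                spins[v] = -spins[v]  # reject the move
            else:
                energy += delta
                if energy == 0:
                    return spins, energy  # proper bicoloring found
    return spins, energy
```

The density of constraints is `m / n` in this sketch; a returned energy of 0 means every hyperedge contains both colors.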

    Small Coupling Expansion for Multiple Sequence Alignment

    The alignment of biological sequences such as DNA, RNA, and proteins is one of the basic tools for detecting evolutionary patterns, as well as functional/structural characterizations, between homologous sequences in different organisms. Typically, state-of-the-art bioinformatics tools are based on profile models that assume the statistical independence of the different sites of the sequences. Over the last years, it has become increasingly clear that homologous sequences show complex patterns of long-range correlations over the primary sequence, as a consequence of the natural evolution process that selects genetic variants under the constraint of preserving the functional/structural determinants of the sequence. Here, we present a new alignment algorithm based on message passing techniques that overcomes the limitations of profile models. Our method is based on a new perturbative small-coupling expansion of the free energy of the model that assumes a linear chain approximation as the $0^{\mathrm{th}}$ order of the expansion. We test the potential of the algorithm against standard competing strategies on several biological sequences. Comment: 20 pages, 9 figures

    The asymptotics of the clustering transition for random constraint satisfaction problems

    Random Constraint Satisfaction Problems exhibit several phase transitions when their density of constraints is varied. One of these threshold phenomena, known as the clustering or dynamic transition, corresponds to a transition for an information theoretic problem called tree reconstruction. In this article we study this threshold for two CSPs, namely the bicoloring of $k$-uniform hypergraphs with a density $\alpha$ of constraints, and the $q$-coloring of random graphs with average degree $c$. We show that in the large $k,q$ limit the clustering transition occurs for $\alpha = \frac{2^{k-1}}{k} (\ln k + \ln \ln k + \gamma_{\rm d} + o(1))$, $c = q (\ln q + \ln \ln q + \gamma_{\rm d} + o(1))$, where $\gamma_{\rm d}$ is the same constant for both models. We characterize $\gamma_{\rm d}$ via a functional equation, solve the latter numerically to estimate $\gamma_{\rm d} \approx 0.871$, and obtain an analytic lower bound $\gamma_{\rm d} \ge 1 + \ln (2 (\sqrt{2}-1)) \approx 0.812$. Our analysis unveils a subtle interplay of the clustering transition with the rigidity (naive reconstruction) threshold that occurs on the same asymptotic scale at $\gamma_{\rm r} = 1$. Comment: 35 pages, 8 figures
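
To make the asymptotic formulas in the abstract above concrete, here is a small sketch that evaluates their leading terms numerically. It simply drops the $o(1)$ correction, so the values are only indicative at finite $k$ or $q$; the default `gamma_d=0.871` is the numerical estimate quoted in the abstract, and the function names are hypothetical.

```python
import math


def clustering_threshold_bicoloring(k, gamma_d=0.871):
    """Leading terms of alpha_d for bicoloring k-uniform hypergraphs (o(1) dropped)."""
    return 2 ** (k - 1) / k * (math.log(k) + math.log(math.log(k)) + gamma_d)


def clustering_threshold_coloring(q, gamma_d=0.871):
    """Leading terms of c_d for q-coloring of random graphs (o(1) dropped)."""
    return q * (math.log(q) + math.log(math.log(q)) + gamma_d)


def gamma_d_lower_bound():
    """Analytic lower bound 1 + ln(2 (sqrt(2) - 1)) quoted in the abstract."""
    return 1 + math.log(2 * (math.sqrt(2) - 1))


if __name__ == "__main__":
    print(f"lower bound on gamma_d: {gamma_d_lower_bound():.3f}")
    for k in (5, 10, 20):
        print(k, clustering_threshold_bicoloring(k))
```

Both functions require an argument greater than 1 so that the iterated logarithm is defined. Setting `gamma_d=1.0` instead gives the corresponding leading-order estimate of the rigidity threshold on the same scale.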

    Statistical mechanics of inference in epidemic spreading

    We investigate the information-theoretical limits of inference tasks in epidemic spreading on graphs in the thermodynamic limit. The typical inference tasks consist in computing observables of the posterior distribution of the epidemic model given observations taken from a ground-truth (sometimes called planted) random trajectory. We can identify two main sources of quenched disorder: the graph ensemble and the planted trajectory. The epidemic dynamics, however, induces nontrivial long-range correlations among individuals' states on the latter. This results in nonlocal correlated quenched disorder, which is typically hard to handle. To overcome this difficulty, we divide the dynamical process into two sets of variables: a set of stochastic independent variables (representing transmission delays), plus a set of correlated variables (the infection times) that depend deterministically on the first. Treating the former as quenched variables and the latter as dynamic ones, computing the disorder average becomes feasible by means of the replica-symmetric cavity method. We give theoretical predictions on the posterior probability distribution of the trajectory of each individual, conditioned on observations of the state of individuals at given times, focusing on the susceptible-infectious (SI) model. In the Bayes-optimal condition, i.e., when true dynamic parameters are known, the inference task is expected to fall in the replica-symmetric regime. We indeed provide predictions for the information theoretic limits of various inference tasks, in the form of phase diagrams. We also identify a region, in the Bayes-optimal setting, with strong hints of replica-symmetry breaking. When true parameters are unknown, we show how a maximum-likelihood procedure is able to recover them with mostly unaffected performance.
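
The split into independent transmission delays and deterministic infection times described in the abstract above can be sketched as follows for the SI model. The sketch assumes continuous, exponentially distributed delays with a hypothetical rate `lam` (the abstract does not specify the delay distribution) and computes each infection time as the earliest arrival over all infection paths, using a Dijkstra-style shortest-path search. The function names are hypothetical.

```python
import heapq
import math
import random


def sample_delays(edges, lam, rng):
    """Draw one independent delay per directed contact (i -> j).

    Assumption: exponential delays with rate lam, used only for illustration.
    """
    delays = {}
    for i, j in edges:
        delays[(i, j)] = rng.expovariate(lam)
        delays[(j, i)] = rng.expovariate(lam)
    return delays


def infection_times(n, delays, sources):
    """Infection times as a deterministic function of the delays.

    Each node satisfies t_j = min_i (t_i + s_ij), with t = 0 at the sources,
    which is a shortest-path problem solved here with Dijkstra's algorithm.
    """
    adjacency = [[] for _ in range(n)]
    for (i, j), s in delays.items():
        adjacency[i].append((j, s))

    times = [math.inf] * n  # inf means the node is never infected
    heap = []
    for src in sources:
        times[src] = 0.0
        heapq.heappush(heap, (0.0, src))

    while heap:
        t, i = heapq.heappop(heap)
        if t > times[i]:
            continue  # stale heap entry
        for j, s in adjacency[i]:
            if t + s < times[j]:
                times[j] = t + s
                heapq.heappush(heap, (times[j], j))
    return times
```

For instance, `infection_times(n, sample_delays(edges, 1.0, random.Random(0)), [0])` yields one planted trajectory for a single patient zero; nodes disconnected from every source keep an infinite infection time.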

    Statistical Mechanics of Inference in Epidemic Spreading

    We study the feasibility of inference tasks in epidemic spreading on graphs in the thermodynamic limit. We identify two main sources of disorder which need to be averaged over: the graph ensemble and the set of individuals' epidemic trajectories, from which observations are taken with a fixed protocol. The dynamics on the contact graph induces non-trivial long-range correlations among individuals' states. This results in non-local correlated quenched disorder, which is typically hard to handle. To overcome this difficulty, we divide the dynamical process into two sets of variables: a set of stochastic independent variables (representing the infection delays), plus a set of correlated variables (the infection times) that depend deterministically on the first. Treating the former as quenched variables and the latter as dynamic ones, the disorder average becomes feasible by means of the Replica Symmetric cavity method. We give theoretical predictions on the posterior probability distribution of the trajectory of each individual, conditioned on (partial and noisy) observations of the state of individuals at given times, focusing on the Susceptible Infected (SI) model. In the Bayes-optimal condition, i.e. when true dynamic parameters are known, the inference task is expected to fall in the Replica Symmetric regime. We indeed provide predictions for the information theoretic limits of various inference tasks, in the form of phase diagrams. When true parameters are unknown, we show how a maximum-likelihood procedure is able to recover them with mostly unaffected performance.

    Barrières algorithmiques dans les problèmes aléatoires de satisfaction de contraintes (Algorithmic barriers in random constraint satisfaction problems)

    The typical complexity of Constraint Satisfaction Problems (CSPs) can be studied using random ensembles of instances. One observes threshold phenomena when the density of constraints increases, in particular a clustering phase transition at which typical solutions shatter into disconnected components. In this PhD thesis, we introduce a bias that breaks the uniformity among solutions of a given CSP instance, and look at the evolution of the clustering threshold under this bias, focusing on the bicoloring of k-uniform random hypergraphs. For small values of k, we show that this bias can delay the clustering transition to higher densities of constraints, and that it has a positive impact on the performance of the Simulated Annealing algorithm in finding a solution for a given instance of the bicoloring problem. In the large k limit, we compute the asymptotic expansion of the clustering threshold for the uniform and the biased measure, and characterize the gain obtained with our implementation of the bias.