
    Slow nucleic acid unzipping kinetics from sequence-defined barriers

    Recent experiments on unzipping of RNA helix-loop structures by force have shown that about 40-base molecules can undergo kinetic transitions between two well-defined `open' and `closed' states, on a timescale of about 1 sec [Liphardt et al., Science 297, 733-737 (2001)]. Using a simple dynamical model, we show that these phenomena result from the slow kinetics of crossing large free energy barriers which separate the open and closed conformations. The dependence of barriers on sequence along the helix, and on the size of the loop(s), is analyzed. Some DNA and RNA sequences that could show dynamics on different time scales, or three (or more)-state unzipping, are proposed.
    Comment: 8 pages Revtex, including 4 figures

    Adaptive Cluster Expansion for Inferring Boltzmann Machines with Noisy Data

    We introduce a procedure to infer the interactions among a set of binary variables, based on their sampled frequencies and pairwise correlations. The algorithm builds the clusters of variables contributing most to the entropy of the inferred Ising model, and rejects the small contributions due to the sampling noise. Our procedure successfully recovers benchmark Ising models even at criticality and in the low temperature phase, and is applied to neurobiological data.
    Comment: Accepted for publication in Physical Review Letters (2011)

    Large Pseudo-Counts and L_2-Norm Penalties Are Necessary for the Mean-Field Inference of Ising and Potts Models

    Mean field (MF) approximation offers a simple, fast way to infer direct interactions between elements in a network of correlated variables, a common, computationally challenging problem with practical applications in fields ranging from physics and biology to the social sciences. However, MF methods achieve their best performance with strong regularization, well beyond Bayesian expectations, an empirical fact that is poorly understood. In this work, we study the influence of pseudo-count and L_2-norm regularization schemes on the quality of inferred Ising or Potts interaction networks from correlation data within the MF approximation. We argue, based on the analysis of small systems, that the optimal value of the regularization strength remains finite even if the sampling noise tends to zero, in order to correct for systematic biases introduced by the MF approximation. Our claim is corroborated by extensive numerical studies of diverse model systems and by the analytical study of the m-component spin model, for large but finite m. Additionally, we find that pseudo-count regularization is robust against sampling noise, and often outperforms L_2-norm regularization, particularly when the underlying network of interactions is strongly heterogeneous. Much better performances are generally obtained for the Ising model than for the Potts model, for which only couplings incoming onto medium-frequency symbols are reliably inferred.
    Comment: 25 pages, 17 figures
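
    As a rough illustration of the mean-field inference with a pseudo-count described in this abstract, the following minimal sketch estimates Ising couplings as minus the off-diagonal part of the inverse connected-correlation matrix. The function names (`invert`, `mean_field_couplings`), the ±1 spin convention, and the choice to implement the pseudo-count as a mixture of the empirical distribution with a uniform one (weight `alpha`) are assumptions made for this example, not details taken from the paper.

```python
def invert(matrix):
    """Gauss-Jordan inversion with partial pivoting (pure Python)."""
    n = len(matrix)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("correlation matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [x / p for x in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def mean_field_couplings(samples, alpha=0.0):
    """Mean-field estimate J_ij = -(C^{-1})_ij for i != j.

    samples: list of spin configurations with entries +1 or -1.
    alpha:   pseudo-count weight in [0, 1); the empirical statistics are
             mixed with those of a uniform distribution (assumed scheme).
    """
    n = len(samples[0])
    count = len(samples)
    w = 1.0 - alpha
    # Regularized magnetizations: the uniform distribution contributes zero.
    m = [w * sum(s[i] for s in samples) / count for i in range(n)]
    corr = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                second = 1.0  # s_i * s_i = 1 for +-1 spins
            else:
                second = w * sum(s[i] * s[j] for s in samples) / count
            corr[i][j] = second - m[i] * m[j]
    inv = invert(corr)
    return [[0.0 if i == j else -inv[i][j] for j in range(n)]
            for i in range(n)]
```

    With perfectly correlated samples such as `[(1, 1), (-1, -1)]` the correlation matrix is singular at `alpha=0`, and a nonzero pseudo-count makes the inversion well defined, which gives a small-scale feel for why some regularization is needed at all.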

    Optimal regularizations for data generation with probabilistic graphical models

    Understanding the role of regularization is a central question in Statistical Inference. Empirically, well-chosen regularization schemes often dramatically improve the quality of the inferred models by avoiding overfitting of the training data. We consider here the particular case of L_2 and L_1 regularizations in the Maximum A Posteriori (MAP) inference of generative pairwise graphical models. Based on analytical calculations on Gaussian multivariate distributions and numerical experiments on Gaussian and Potts models we study the likelihoods of the training, test, and 'generated data' (with the inferred models) sets as functions of the regularization strengths. We show in particular that, at its maximum, the test likelihood and the 'generated' likelihood, which quantifies the quality of the generated samples, have remarkably close values. The optimal value for the regularization strength is found to be approximately equal to the inverse sum of the squared couplings incoming on sites on the underlying network of interactions. Our results seem largely independent of the structure of the true underlying interactions that generated the data, of the regularization scheme considered, and are valid when small fluctuations of the posterior distribution around the MAP estimator are taken into account. Connections with empirical works on protein models learned from homologous sequences are discussed.

    Exponentially hard problems are sometimes polynomial, a large deviation analysis of search algorithms for the random Satisfiability problem, and its application to stop-and-restart resolutions

    A large deviation analysis of the solving complexity of random 3-Satisfiability instances slightly below threshold is presented. While finding a solution for such instances demands an exponential effort with high probability, we show that an exponentially small fraction of resolutions require a computation scaling linearly in the size of the instance only. This exponentially small probability of easy resolutions is analytically calculated, and the corresponding exponent shown to be smaller (in absolute value) than the growth exponent of the typical resolution time. Our study therefore gives some theoretical basis to heuristic stop-and-restart solving procedures, and suggests a natural cut-off (the size of the instance) for the restart.
    Comment: Revtex file, 4 figures

    Solving satisfiability problems by fluctuations: The dynamics of stochastic local search algorithms

    Stochastic local search algorithms are frequently used to numerically solve hard combinatorial optimization or decision problems. We give numerical and approximate analytical descriptions of the dynamics of such algorithms applied to random satisfiability problems. We find two different dynamical regimes, depending on the number of constraints per variable: For low constraintness, the problems are solved efficiently, i.e. in linear time. For higher constraintness, the solution times become exponential. We observe that the dynamical behavior is characterized by a fast equilibration and fluctuations around this equilibrium. If the algorithm runs long enough, an exponentially rare fluctuation towards a solution appears.
    Comment: 21 pages, 18 figures, revised version, to app. in PRE (2003)

    The dynamics of proving uncolourability of large random graphs I. Symmetric Colouring Heuristic

    We study the dynamics of a backtracking procedure capable of proving uncolourability of graphs, and calculate its average running time T for sparse random graphs, as a function of the average degree c and the number of vertices N. The analysis is carried out by mapping the history of the search process onto an out-of-equilibrium (multi-dimensional) surface growth problem. The growth exponent of the average running time is quantitatively predicted, in agreement with simulations.
    Comment: 5 figures

    Multifractal analysis of perceptron learning with errors

    Random input patterns induce a partition of the coupling space of a perceptron into cells labeled by their output sequences. Learning some data with a maximal error rate leads to clusters of neighboring cells. By analyzing the internal structure of these clusters with the formalism of multifractals, we can handle different storage and generalization tasks for lazy students and absent-minded teachers within one unified approach. The results also allow some conclusions on the spatial distribution of cells.
    Comment: 11 pages, RevTex, 3 eps figures, version to be published in Phys. Rev. E 01Jan9

    Relaxation and Metastability in the RandomWalkSAT search procedure

    An analysis of the average properties of a local search resolution procedure for the satisfaction of random Boolean constraints is presented. Depending on the ratio alpha of constraints per variable, resolution takes a time T_res growing linearly (T_res \sim tau(alpha) N, alpha < alpha_d) or exponentially (T_res \sim exp(N zeta(alpha)), alpha > alpha_d) with the size N of the instance. The relaxation time tau(alpha) in the linear phase is calculated through a systematic expansion scheme based on a quantum formulation of the evolution operator. For alpha > alpha_d, the system is trapped in some metastable state, and resolution occurs by escaping from this state through the crossing of a large barrier. An annealed calculation of the height zeta(alpha) of this barrier is proposed. The polynomial/exponential cross-over alpha_d is not related to the onset of clustering among solutions.
    Comment: 23 pages, 11 figures. A mistake in sec. IV.B has been corrected
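
    For readers unfamiliar with this kind of local search, the sketch below shows one common pure random-walk rule: pick an unsatisfied clause at random and flip one of its variables at random. It is only an illustrative version under stated assumptions. The names `random_walk_sat` and `random_3sat`, the DIMACS-style signed-integer literals, and the `max_flips` cut-off are hypothetical choices for this example and are not taken from the paper.

```python
import random


def random_walk_sat(clauses, n_vars, max_flips=100_000, seed=0):
    """Pure random-walk local search for SAT (illustrative sketch).

    clauses: list of tuples of nonzero ints; literal k asks variable |k|
             to be True if k > 0 and False if k < 0 (assumed convention).
    Returns a satisfying assignment as a dict, or None if none was found
    within max_flips flips.
    """
    rng = random.Random(seed)
    assign = {v: rng.choice([True, False]) for v in range(1, n_vars + 1)}

    def satisfied(clause):
        return any(assign[abs(lit)] == (lit > 0) for lit in clause)

    for _ in range(max_flips):
        unsat = [c for c in clauses if not satisfied(c)]
        if not unsat:
            return assign
        clause = rng.choice(unsat)     # random violated clause
        var = abs(rng.choice(clause))  # random variable inside it
        assign[var] = not assign[var]  # flip it
    # The last flip may have produced a solution; check once more.
    return assign if all(satisfied(c) for c in clauses) else None


def random_3sat(n_vars, alpha, seed=0):
    """Random 3-SAT instance with int(alpha * n_vars) clauses."""
    rng = random.Random(seed)
    clauses = []
    for _ in range(int(alpha * n_vars)):
        variables = rng.sample(range(1, n_vars + 1), 3)
        clauses.append(tuple(v if rng.random() < 0.5 else -v
                             for v in variables))
    return clauses
```

    Calling `random_walk_sat(random_3sat(50, alpha), 50)` for several values of `alpha` and counting flips is one simple way to observe how the search slows down as the ratio of constraints per variable grows.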