325 research outputs found

    Numerical evidence for relevance of disorder in a Poland-Scheraga DNA denaturation model with self-avoidance: Scaling behavior of average quantities

    We study numerically the effect of sequence heterogeneity on the thermodynamic properties of a Poland-Scheraga model for DNA denaturation taking into account self-avoidance, i.e. with exponent c_p=2.15 for the loop length probability distribution. Complementing previous on-lattice Monte Carlo-like studies, we consider here off-lattice numerical calculations for large sequence lengths, relying on efficient algorithmic methods. We investigate finite size effects with the definition of an appropriate intrinsic length scale x, depending on the parameters of the model. Based on the occurrence of large enough rare regions, for a given sequence length N, this study provides a qualitative picture for the finite size behavior, suggesting that the effect of disorder could be sensed only with sequence lengths diverging exponentially with x. We further look in detail at average quantities for the particular case x=1.3, ensuring through this parameter choice the correspondence between the off-lattice and the on-lattice studies. Taken together, the various results can be cast in a coherent picture with a crossover between a nearly pure-system-like behavior for small sizes N < 1000, as observed in the on-lattice simulations, and the apparent asymptotic behavior indicative of disorder relevance, with an (average) correlation length exponent \nu_r >= 2/d (=2). Comment: LaTeX, 33 pages with 15 postscript figures

    Numerical study of the disordered Poland-Scheraga model of DNA denaturation

    We numerically study the binary disordered Poland-Scheraga model of DNA denaturation, in the regime where the pure model displays a first order transition (loop exponent c=2.15>2). We use a Fixman-Freire scheme for the entropy of loops and consider chain length up to N=4 \cdot 10^5, with averages over 10^4 samples. We present in parallel the results of various observables for two boundary conditions, namely bound-bound (bb) and bound-unbound (bu), because they present very different finite-size behaviors, both in the pure case and in the disordered case. Our main conclusion is that the transition remains first order in the disordered case: in the (bu) case, the disorder averaged energy and contact densities present crossings for different values of N without rescaling. In addition, we obtain that these disorder averaged observables do not satisfy finite size scaling, as a consequence of strong sample to sample fluctuations of the pseudo-critical temperature. For a given sample, we propose a procedure to identify its pseudo-critical temperature, and show that this sample then obeys first order transition finite size scaling behavior. Finally, we obtain that the disorder averaged critical loop distribution is still governed by P(l) \sim 1/l^c in the regime l \ll N, as in the pure case. Comment: 12 pages, 13 figures. Revised version

    Prayer: its psychological and philosophical aspects.

    Thesis (M.A.)--Boston University

    Evolution of proteomes: fundamental signatures and global trends in amino acid compositions

    BACKGROUND: The evolutionary characterization of species and lifestyles at global levels is nowadays a subject of considerable interest, particularly with the availability of many complete genomes. Are there specific properties associated with lifestyles and phylogenies? What are the underlying evolutionary trends? One of the simplest analyses to address such questions concerns characterization of proteomes at the amino acid composition level. RESULTS: In this work, amino acid compositions of a large set of 208 proteomes, with a significant number of representatives from the three phylogenetic domains and different lifestyles, are analyzed, resorting to an appropriate multidimensional method: correspondence analysis. The analysis reveals striking discrimination between eukaryotes, prokaryotic mesophiles and hyperthermophiles-thermophiles, following amino acid usage. In sharp contrast, no similar discrimination is observed for psychrophiles. The observed distributional properties are compared with various inferred chronologies for the recruitment of amino acids into the genetic code. Such comparisons reveal correlations between the observed segregations of species following amino acid usage, and the separation of amino acids following early or late recruitment. CONCLUSION: A simple description of proteomes according to amino acid compositions reveals striking signatures, with sharp segregations or on the contrary non-discriminations following phylogenies and lifestyles. The distribution of species, following amino acid usage, exhibits a discrimination between [high GC]-[high optimal growth temperatures] and [low GC]-[moderate temperatures] characteristics. This discrimination appears to coincide closely with the separation of amino acids following their inferred early or late recruitment into the genetic code. Taken together, the various results provide a consistent picture for the evolution of proteomes, in terms of amino acid usage.

    Probabilistic sequence alignments: realistic models with efficient algorithms

    Alignment algorithms usually rely on simplified models of gaps for computational efficiency. Based on an isomorphism between alignments and physical helix-coil models, we show in statistical mechanics that alignments with realistic laws for gaps can be computed with fast algorithms. Improved performances of probabilistic alignments with realistic models of gaps are illustrated. Probabilistic and optimization formulations are compared, with potential implications in many fields and perspectives for computationally efficient extensions to Markov models with realistic long-range interactions.

    A stitch in time: Efficient computation of genomic DNA melting bubbles

    Background: It is of biological interest to make genome-wide predictions of the locations of DNA melting bubbles using statistical mechanics models. Computationally, this poses the challenge that a generic search through all combinations of bubble starts and ends is quadratic. Results: An efficient algorithm is described, which shows that the time complexity of the task is O(N log N) rather than quadratic. The algorithm exploits that bubble lengths may be limited, but without a prior assumption of a maximal bubble length. No approximations, such as windowing, have been introduced to reduce the time complexity. More than just finding the bubbles, the algorithm produces a stitch profile, which is a probabilistic graphical model of bubbles and helical regions. The algorithm applies a probability peak finding method based on a hierarchical analysis of the energy barriers in the Poland-Scheraga model. Conclusions: Exact and fast computation of genomic stitch profiles is thus feasible. Sequences of several megabases have been computed, only limited by computer memory. Possible applications are the genome-wide comparisons of bubbles with promoters, TSS, viral integration sites, and other melting-related regions. Comment: 16 pages, 10 figures

    The Mystery of Two Straight Lines in Bacterial Genome Statistics. Release 2007

    In special coordinates (codon position-specific nucleotide frequencies) bacterial genomes form two straight lines in 9-dimensional space: one line for eubacterial genomes, another for archaeal genomes. All the 348 distinct bacterial genomes available in Genbank in April 2007 belong to these lines with high accuracy. The main challenge now is to explain the observed high accuracy. The new phenomenon of complementary symmetry for codon position-specific nucleotide frequencies is observed. The results of analysis of several codon usage models are presented. We demonstrate that the mean-field approximation, which is also known as context-free, or complete independence model, or Segre variety, can serve as a reasonable approximation to the real codon usage. The first two principal components of codon usage correlate strongly with genomic G+C content and the optimal growth temperature respectively. The variation of codon usage along the third component is related to the curvature of the mean-field approximation. The first three eigenvalues in codon usage PCA explain 59.1%, 7.8% and 4.7% of variation. The eubacterial and archaeal genomes' codon usage is clearly distributed along two third order curves with genomic G+C content as a parameter. Comment: Significantly extended version with new data for all the 348 distinct bacterial genomes available in Genbank in April 2007