375 research outputs found

    The Time Machine: A Simulation Approach for Stochastic Trees

    We consider a simulation technique for stochastic trees. One of the most important areas in computational genetics is the calculation and subsequent maximization of the likelihood function associated with such models. This typically involves importance sampling (IS) and sequential Monte Carlo (SMC) techniques. The approach proceeds by simulating the tree, backward in time from observed data, to a most recent common ancestor (MRCA). However, in many cases the computational time and the variance of the estimators are too high to make standard approaches useful. In this paper we propose to stop the simulation, subsequently yielding biased estimates of the likelihood surface. The bias is investigated from a theoretical point of view. Results from simulation studies are also given to investigate the balance between loss of accuracy, saving in computing time and variance reduction. Comment: 22 pages, 5 figures

    Gene-history correlation and population structure

    Correlation of gene histories in the human genome determines the patterns of genetic variation (haplotype structure) and is crucial to understanding genetic factors in common diseases. We derive closed analytical expressions for the correlation of gene histories in established demographic models for genetic evolution and show how to extend the analysis to more realistic (but more complicated) models of demographic structure. We identify two contributions to the correlation of gene histories in divergent populations: linkage disequilibrium, and differences in the demographic history of individuals in the sample. These two factors contribute to correlations at different length scales: the former at small, and the latter at large scales. We show that recent mixing events in divergent populations limit the range of correlations and compare our findings to empirical results on the correlation of gene histories in the human genome. Comment: Revised and extended version: 26 pages, 5 figures, 1 table

    Conformational Mechanics of Polymer Adsorption Transitions at Attractive Substrates

    Conformational phases of a semiflexible off-lattice homopolymer model near an attractive substrate are investigated by means of multicanonical computer simulations. In our polymer-substrate model, nonbonded pairs of monomers as well as monomers and the substrate interact via attractive van der Waals forces. To characterize conformational phases of this hybrid system, we analyze thermal fluctuations of energetic and structural quantities, as well as adequate docking parameters. Introducing a solvent parameter related to the strength of the surface attraction, we construct and discuss the solubility-temperature phase diagram. Apart from the main phases of adsorbed and desorbed conformations, we identify several other phase transitions such as the freezing transition between energy-dominated crystalline low-temperature structures and globular entropy-dominated conformations. Comment: 13 pages, 15 figures

    Evolutionary distances in the twilight zone -- a rational kernel approach

    Phylogenetic tree reconstruction is traditionally based on multiple sequence alignments (MSAs) and heavily depends on the validity of this information bottleneck. With increasing sequence divergence, the quality of MSAs decays quickly. Alignment-free methods, on the other hand, are based on abstract string comparisons and avoid potential alignment problems. However, in general they are not biologically motivated and ignore our knowledge about the evolution of sequences. Thus, it is still a major open question how to define an evolutionary distance metric between divergent sequences that makes use of indel information and known substitution models without the need for a multiple alignment. Here we propose a new evolutionary distance metric to close this gap. It uses finite-state transducers to create a biologically motivated similarity score which models substitutions and indels, and does not depend on a multiple sequence alignment. The sequence similarity score is defined in analogy to pairwise alignments and additionally has the positive semi-definite property. We describe its derivation and show in simulation studies and real-world examples that it is more accurate in reconstructing phylogenies than competing methods. The result is a new and accurate way of determining evolutionary distances in and beyond the twilight zone of sequence alignments that is suitable for large datasets. Comment: to appear in PLoS ONE

    Phylogeny.fr: robust phylogenetic analysis for the non-specialist

    Phylogenetic analyses are central to many research areas in biology and typically involve the identification of homologous sequences, their multiple alignment, the phylogenetic reconstruction and the graphical representation of the inferred tree. The Phylogeny.fr platform transparently chains programs to automatically perform these tasks. It is primarily designed for biologists with no experience in phylogeny, but can also meet the needs of specialists; the former will find up-to-date tools chained in a phylogeny pipeline to analyze their data in a simple and robust way, while the specialists will be able to easily build and run sophisticated analyses. Phylogeny.fr offers three main modes. The ‘One Click’ mode targets non-specialists and provides a ready-to-use pipeline chaining programs with recognized accuracy and speed: MUSCLE for multiple alignment, PhyML for tree building, and TreeDyn for tree rendering. All parameters are set up to suit most studies, and users only have to provide their input sequences to obtain a ready-to-print tree. The ‘Advanced’ mode uses the same pipeline but allows the parameters of each program to be customized by users. The ‘A la Carte’ mode offers more flexibility and sophistication, as users can build their own pipeline by selecting and setting up the required steps from a large choice of tools to suit their specific needs. Prior to phylogenetic analysis, users can also collect neighbors of a query sequence by running BLAST on general or specialized databases. A guide tree then helps to select neighbor sequences to be used as input for the phylogeny pipeline. Phylogeny.fr is available at: http://www.phylogeny.fr

    Nature of the quantum phase transitions in the two-dimensional hardcore boson model

    We use two Quantum Monte Carlo algorithms to map out the phase diagram of the two-dimensional hardcore boson Hubbard model with near ($V_1$) and next near ($V_2$) neighbor repulsion. At half filling we find three phases: Superfluid (SF), checkerboard solid and striped solid depending on the relative values of $V_1$, $V_2$ and the kinetic energy. Doping away from half filling, the checkerboard solid undergoes phase separation: The superfluid and solid phases co-exist but not as a single thermodynamic phase. As a function of doping, the transition from the checkerboard solid is therefore first order. In contrast, doping the striped solid away from half filling instead produces a striped supersolid phase: Co-existence of density order with superfluidity as a single phase. One surprising result is that the entire line of transitions between the SF and checkerboard solid phases at half filling appears to exhibit dynamical O(3) symmetry restoration. The transitions appear to be in the same universality class as the special Heisenberg point even though this symmetry is explicitly broken by the $V_2$ interaction. Comment: 10 pages, 14 eps figures, include

    Mitochondrial phylogeography and demographic history of the Vicuña: implications for conservation

    The vicuña (Vicugna vicugna; Miller, 1924) is a conservation success story, having recovered from near extinction in the 1960s to current population levels estimated at 275 000. However, lack of information about its demographic history and genetic diversity has limited both our understanding of its recovery and the development of science-based conservation measures. To examine the evolution and recent demographic history of the vicuña across its current range and to assess its genetic variation and population structure, we sequenced mitochondrial DNA from the control region (CR) for 261 individuals from 29 populations across Peru, Chile and Argentina. Our results suggest that populations currently designated as Vicugna vicugna vicugna and Vicugna vicugna mensalis comprise separate mitochondrial lineages. The current population distribution appears to be the result of a recent demographic expansion associated with the last major glacial event of the Pleistocene in the northern (18 to 22°S) dry Andes 14–12 000 years ago and the establishment of an extremely arid belt known as the 'Dry Diagonal' to 29°S. Within the Dry Diagonal, small populations of V. v. vicugna appear to have survived, showing the genetic signature of demographic isolation, whereas to the north V. v. mensalis populations underwent a rapid demographic expansion before recent anthropogenic impacts.

    Directed Loop Updates for Quantum Lattice Models

    This article outlines how the recently introduced quantum Monte Carlo directed loop update can be applied to a wide class of quantum lattice models. Several models are considered: Spin-S XXZ models with longitudinal and transverse magnetic fields, boson models with two-body interactions, and 1D spinful fermion models. Expressions are given for the parameter regimes where very efficient "no-bounce" quantum Monte Carlo algorithms can be found. Comment: 18 pages, 19 figures

    A new, fast algorithm for detecting protein coevolution using maximum compatible cliques

    Background: The MatrixMatchMaker (MMM) algorithm was recently introduced to detect the similarity between phylogenetic trees and thus the coevolution between proteins. MMM finds the largest common submatrices between pairs of phylogenetic distance matrices, and has numerous advantages over existing methods of coevolution detection. However, these advantages came at the cost of a very long execution time. Results: In this paper, we show that the problem of finding the maximum submatrix reduces to a multiple maximum clique subproblem on a graph of protein pairs. This allowed us to develop a new algorithm and program implementation, MMMvII, which achieved more than 600× speedup with comparable accuracy to the original MMM. Conclusions: MMMvII will thus allow for more extensive and intricate analyses of coevolution. Availability: An implementation of the MMMvII algorithm is available at: http://www.uhnresearch.ca/labs/tillier/MMMWEBvII/MMMWEBvII.php