
    HPC: Hierarchical phylogeny construction

    Rapid improvements in DNA sequencing technology have produced long genome sequences for a large number of similar isolates with a wide range of single nucleotide polymorphism (SNP) rates, where some isolates have SNP rates thousands of times lower than others. Genome sequences of this kind are a challenge for existing methods of phylogenetic tree construction. We address this issue by developing a hierarchical approach to phylogeny construction. In this method, construction is performed at multiple levels; at each level, groups of isolates with comparable degrees of similarity are identified and their phylogenetic trees are constructed. Time savings are achieved by using a sufficiently large subset of columns from the input alignment instead of all of its columns. Our results show that the new approach is 20-60 times more efficient than existing programs and more accurate in situations where highly similar isolates have a wide range of SNP rates.
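The column-subsampling idea mentioned in the abstract above can be illustrated with a minimal sketch. It is not the HPC implementation: the function names `sample_columns` and `snp_rate`, the uniform random sampling, and the dictionary-of-strings alignment format are all assumptions made for illustration, and the abstract does not say how columns are chosen or how many are "sufficiently large".

```python
import random


def sample_columns(alignment, k, seed=0):
    """Restrict an alignment to k randomly chosen columns.

    `alignment` maps isolate name -> aligned sequence (all the same length).
    Uniform random sampling is an assumption, not the paper's stated rule.
    """
    length = len(next(iter(alignment.values())))
    if any(len(seq) != length for seq in alignment.values()):
        raise ValueError("all sequences must have the same length")
    rng = random.Random(seed)
    # Keep the chosen columns in their original order.
    cols = sorted(rng.sample(range(length), min(k, length)))
    return {name: "".join(seq[c] for c in cols)
            for name, seq in alignment.items()}


def snp_rate(a, b):
    """Fraction of aligned positions at which two sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


if __name__ == "__main__":
    alignment = {"iso1": "ACGTACGT", "iso2": "ACGTACGA"}
    subset = sample_columns(alignment, 4)
    # A distance estimated on the subset approximates the full-alignment one.
    print(snp_rate(subset["iso1"], subset["iso2"]))
```

The smaller alignment can then be passed to whatever tree-building program is in use; the trade-off is that too few columns may miss the rare SNPs that separate very similar isolates.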

    Hierarchical phylogeny construction

    Constructing a phylogenetic tree for a number of species from their genome sequences is very important for understanding the evolutionary history of those species. Rapid improvements in DNA sequencing technology have generated sequence data for a huge number of similar isolates with a wide range of single nucleotide polymorphism (SNP) rates, where the SNP rate among some isolates can be thousands of times lower than among others. Genome sequences of this kind are difficult for existing methods because the subtree(s) (or clades) consisting of species or isolates with very low SNP rates may have a very low level of resolution, and their evolutionary history may not be accurately represented. Identifying the informative columns in the alignment, which contain important variations in the genomes of those species, is important in constructing their evolutionary history. Here we describe a method for selecting informative regions for a set of isolates based on the observation that the likelihood of informative columns is sensitive to changes in the tree topology. We show that these informative columns increase the correctness of the phylogenies constructed for closely related isolates. We then address the generalized version of this problem by developing a hierarchical approach to phylogeny construction. In this method, construction is performed at multiple levels; at each level, groups of isolates with comparable degrees of similarity are identified and their phylogenetic trees are constructed. We also detect those multiple levels of similarity in an automated manner. Our results show that this new hierarchical approach is much more efficient, and sometimes more accurate, than existing approaches that build the phylogenetic tree by maximum likelihood from the whole alignment for all the isolates.
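The grouping step described above (identifying isolates with comparable degrees of similarity at each level) might look roughly like the following sketch. It swaps in a simple technique, single-linkage grouping on pairwise SNP rate with a fixed threshold, which is an assumption and not the method of the thesis; the names `group_isolates` and `snp_rate` and the threshold values are hypothetical, and the automated detection of levels mentioned in the abstract is not reproduced here.

```python
from itertools import combinations


def snp_rate(a, b):
    """Fraction of aligned positions at which two sequences differ."""
    return sum(x != y for x, y in zip(a, b)) / len(a)


def group_isolates(alignment, threshold):
    """Group isolates by single linkage on pairwise SNP rate.

    Two isolates share a group if a chain of pairs, each with SNP rate at
    or below `threshold`, connects them. The threshold is a hand-picked
    assumption for this sketch.
    """
    parent = {name: name for name in alignment}

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(alignment, 2):
        if snp_rate(alignment[a], alignment[b]) <= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for name in alignment:
        groups.setdefault(find(name), []).append(name)
    return sorted(sorted(members) for members in groups.values())


if __name__ == "__main__":
    alignment = {
        "iso1": "AAAAAAAAAA",
        "iso2": "AAAAAAAAAC",
        "iso3": "CCCCCGGGGG",
        "iso4": "CCCCCGGGGT",
    }
    # A looser threshold merges close isolates; a tighter one separates them.
    print(group_isolates(alignment, 0.2))
    print(group_isolates(alignment, 0.05))
```

Running the grouping at several thresholds gives nested levels: a tree can be built within each tight group and the groups themselves related at a coarser level.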