12 research outputs found

    Next-generation anchor based phylogeny (NexABP): constructing phylogeny from next-generation sequencing data

    Whole genome sequences are ideally suited for deriving evolutionary relationships among organisms. With Next Generation Sequencing (NGS) datasets available on an unprecedented scale, it is highly desirable to carry out phylogenetic analysis directly on short-read NGS data. We describe here an anchor-based approach, NexABP, for phylogenetic construction of closely related strains/isolates from NGS data. This approach can be used even in the absence of a fully assembled reference genome and works by reducing the complexity of the datasets without compromising results. NexABP was used to construct phylogenies of different strains of common pathogens such as Mycobacterium tuberculosis, Vibrio cholerae and Escherichia coli. In addition to classifying strains into distinct lineages, NexABP could resolve inner branches and allows statistical testing using bootstrap analysis. We believe that NexABP-based phylogenetic analysis has some clear advantages over other methods.

    MGDD: Mycobacterium tuberculosis Genome Divergence Database

    Background: Variation in genomes among closely related organisms can be linked to phenotypic differences. A number of mechanisms, such as replication error, repeat expansion and contraction, recombination and transposition, can contribute to genomic differences. These processes generate SNPs, different types of repeat-based and transposon- or IS-element-based polymorphisms, inversions, duplications and changes in synteny. A database of all the variations in a group of organisms is useful not only for understanding genotype-phenotype relationships but also in clinical applications. At present there is no database that provides detailed information about genomic variations among different strains and species of the Mycobacterium tuberculosis complex, organisms responsible for human diseases. Description: MGDD is a free web-based database that allows quick, user-friendly searches for different types of genomic variations among a group of fully sequenced organisms belonging to the M. tuberculosis complex. The searches are based on data generated by pairwise comparison using a tool that has already been described. The types of variations that can be searched are SNPs, indels, tandem repeats and divergent regions. Searches can be designed to find specific variations either in a given gene or at any given location of the query genome with respect to any other genome currently available. Conclusion: The web-based database MGDD can help to find all the possible differences that exist between two strains or species of the M. tuberculosis complex. The search tool is very user-friendly, can be used by anyone not familiar with computational methods, and will be useful to both clinicians and researchers working on tuberculosis and other mycobacterial diseases.

    Anchor-Based Whole Genome Phylogeny (ABWGP): A Tool for Inferring Evolutionary Relationship among Closely Related Microorganisms

    Phenotypic behavior of a group of organisms can be studied using a range of molecular evolutionary tools that help to determine evolutionary relationships. Traditionally, a gene or a set of gene sequences was used for generating phylogenetic trees. Incomplete evolutionary information in a few selected genes causes problems in phylogenetic tree construction, and whole genomes are used as a remedy. The task is then to identify suitable parameters for extracting the information in whole genome sequences that truly represents evolutionary relationships. In this study we explored a random anchor (a stretch of 100 nucleotides) based approach (ABWGP) for finding the distance between any two genomes, and used the distance estimates to compute evolutionary trees. A number of strains and species of Mycobacteria were used for this study. Anchor-derived parameters, such as cumulative normalized score, anchor order and indels, were computed in a pairwise manner, and the scores were used to compute distance/phylogenetic trees. The strength of branching was determined by bootstrap analysis. The terminal branches are clearly discernible using the distance estimates described here. In general, different measures gave similar trees, except for the trees based on indels. Overall, the tree topology reflected the known biology of the organisms. This was also true for different strains of Escherichia coli. A new whole genome-based approach has been described here for studying evolutionary relationships among bacterial strains and species.
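
    The random-anchor idea in the abstract above can be illustrated with a minimal sketch. It is not the ABWGP implementation: the function names are hypothetical, anchors are matched by exact substring search rather than by a homology search, and the distance is simply the fraction of anchors with no exact copy in the second genome, standing in for the cumulative normalized score, anchor order and indel measures mentioned in the abstract.

```python
import random


def sample_anchors(genome, anchor_len=100, n_anchors=50, seed=0):
    """Pick random fixed-length substrings (anchors) from a genome string."""
    rng = random.Random(seed)
    last_start = len(genome) - anchor_len
    if last_start < 0:
        raise ValueError("genome is shorter than the anchor length")
    starts = [rng.randint(0, last_start) for _ in range(n_anchors)]
    return [genome[s:s + anchor_len] for s in starts]


def anchor_distance(genome_a, genome_b, anchor_len=100, n_anchors=50, seed=0):
    """Fraction of anchors sampled from genome_a with no exact copy in genome_b.

    ASSUMPTION: exact matching replaces the homology search a real tool
    would use, so a single mismatch inside an anchor counts as "missing".
    """
    anchors = sample_anchors(genome_a, anchor_len, n_anchors, seed)
    missing = sum(1 for anchor in anchors if anchor not in genome_b)
    return missing / len(anchors)
```

    Computing this value for every pair of genomes would give a distance matrix that a standard distance-based tree method could take as input. The measure is not symmetric, since anchors are drawn from the first genome only.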

    Episodes of horizontal gene-transfer and gene-fusion led to co-existence of different metal-ion specific glyoxalase I

    The glyoxalase pathway plays an important role in stress adaptation and in many clinical disorders. The first enzyme of this pathway, glyoxalase I (GlxI), uses methylglyoxal as a substrate and requires either Ni(II)/Co(II) or Zn(II) for activity. Here we have investigated the origin of the different metal ion specificities of GlxI and the subsequent pattern of inheritance during evolution. Our results suggest a primitive origin of single-domain Ni-dependent GlxI [Ni-GlxI]. This subsequently evolved into Zn-activated GlxI [Zn-GlxI] in deltaproteobacteria. However, the origin of eukaryotic Zn-GlxI is different and can be traced to GlxI from Candidatus Pelagibacter and Sphingomonas. In eukaryotes GlxI has evolved as a two-domain protein, but the corresponding Zn form is lost in plants/higher eukaryotes. In plants, gene expansion has given rise to multiple two-domain Ni-GlxI forms, which are differentially regulated under abiotic stress conditions. Our results suggest that different forms of GlxI have evolved to help plants adapt to stress.

    Comparative analysis of bacterial genomes: identification of divergent regions in mycobacterial strains using an anchor-based approach

    Comparative genomic approaches are useful in identifying molecular differences between organisms. Currently available methods fail to identify small changes in genomes, such as expansion of short repetitive motifs, and to analyse divergent sequences. In this report, we describe an anchor-based whole genome comparison (ABWGC) method. ABWGC is based on random sampling of anchor sequences from one genome, followed by analysis of sampled and homologous regions from the target genome. The method was applied to compare two strains of Mycobacterium tuberculosis, CDC1551 and H37Rv. ABWGC identified a total of 104 indels, including 20 expansions of short repetitive sequences and five recombination events; 18 of these genomic differences had not been identified previously. ABWGC also identified 188 SNPs, including eight new ones. The method was also used to compare the M. tuberculosis H37Rv and M. avium genomes. ABWGC correctly picked 1002 additional indels (size >100 nt) between the two organisms in contrast to MUMmer, a popular tool for comparative genomics. ABWGC also correctly identified repeat expansions and indels in a set of simulated sequences. The study also revealed an important role of small repeat expansion in the evolution of M. tuberculosis strains.
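
    One way anchors can expose an indel is sketched below, under stated assumptions. This is not the ABWGC algorithm: the function name is hypothetical, anchors are taken at a fixed step rather than sampled at random, they are located in the target by exact substring search instead of a homology search, and each anchor is assumed to occur only once in the target. The sketch compares the spacing of consecutive anchors in the two sequences; a change in spacing suggests an insertion or deletion between them.

```python
def spacing_differences(query, target, anchor_len=20, step=200):
    """Report intervals where anchor spacing differs between two sequences.

    Returns (query_start, query_end, delta) tuples, where delta > 0 means
    extra bases in the target and delta < 0 means missing bases.
    ASSUMPTION: evenly spaced, exactly matching, unique anchors; a real
    tool would tolerate mismatches and repeated anchors.
    """
    hits = []
    for q_pos in range(0, len(query) - anchor_len + 1, step):
        t_pos = target.find(query[q_pos:q_pos + anchor_len])
        if t_pos != -1:  # keep only anchors that were located in the target
            hits.append((q_pos, t_pos))

    events = []
    for (q1, t1), (q2, t2) in zip(hits, hits[1:]):
        delta = (t2 - t1) - (q2 - q1)
        if delta != 0:
            events.append((q1, q2, delta))
    return events
```

    For example, if ten bases are inserted into the target between two neighbouring anchors, the function reports that interval with a delta of 10. Pinning down the exact position and content of the change would still require aligning the region between the two anchors.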

    ABWGAT: anchor-based whole genome analysis tool

    Large numbers of genomes are being sequenced regularly, and the rate will go up in future due to the availability of new genome sequencing techniques. In order to understand genotype-to-phenotype relationships, it is necessary to identify sequence variations at the genomic level. Aligning a pair of genomes and parsing the alignment data is an accepted approach for identifying variations. Though a number of tools are available for whole-genome alignment, none of these allows automatic parsing of the alignment and identification of different kinds of genomic variants with a high degree of sensitivity. Here we present a simple web-based interface for whole genome comparison named ABWGAT (Anchor-Based Whole Genome Analysis Tool). The output is a list of variations such as SNVs, indels, repeat expansions and inversions.