19 research outputs found

    Assessing the Quality of Whole Genome Alignments in Bacteria

    Comparing genomes is an essential preliminary step in solving many problems in biology. Matching long similar segments between two genomes is a precondition for their evolutionary, genetic, and genome rearrangement analyses. Though various comparison methods have been developed in recent years, a quantitative assessment of their performance is lacking. Here, we describe two families of assessment measures whose purpose is to evaluate bacteria-oriented comparison tools. The first measure is based on how well the genome segmentation fits the gene annotation of the studied organisms; the second uses the number of segments created by the segmentation and the percentage of the two genomes that are conserved. The effectiveness of the two measures is demonstrated by applying them to the results of genome comparison tools obtained on 41 pairs of bacterial species. Despite the difference in the nature of the two types of measurements, both show consistent results, providing insights into the subtle differences between the mapping tools.
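
    As a rough illustration of the raw ingredients of the second measure (segment count and percentage conserved), the sketch below computes both from a list of paired segments. The abstract does not give the actual formula, so everything here is an assumption: the function names, the half-open `(start, end)` coordinate convention, and the choice to pool the covered length over both genomes.

```python
# Hypothetical sketch: the names and the pooled-percentage definition are
# assumptions for illustration, not the measure defined in the paper.

def covered_length(intervals):
    """Total length of the union of half-open (start, end) intervals."""
    total = 0
    cur_start = cur_end = None
    for start, end in sorted(intervals):
        if cur_end is None or start > cur_end:
            # Disjoint from the running interval: close it and start a new one.
            if cur_end is not None:
                total += cur_end - cur_start
            cur_start, cur_end = start, end
        else:
            # Overlapping or touching: extend the running interval.
            cur_end = max(cur_end, end)
    if cur_end is not None:
        total += cur_end - cur_start
    return total


def segmentation_summary(segments, genome1_len, genome2_len):
    """Count segments and report the pooled percentage of both genomes covered.

    `segments` is a list of ((s1, e1), (s2, e2)) pairs: one interval in each genome.
    """
    covered1 = covered_length([seg1 for seg1, _ in segments])
    covered2 = covered_length([seg2 for _, seg2 in segments])
    pct = 100.0 * (covered1 + covered2) / (genome1_len + genome2_len)
    return {"num_segments": len(segments), "pct_conserved": pct}


summary = segmentation_summary(
    [((0, 400), (100, 500)), ((300, 600), (700, 900))],
    genome1_len=1000,
    genome2_len=1000,
)
print(summary)
```

    Taking the union of intervals before summing avoids counting overlapping segments twice; whether the published measure does the same is not stated in the abstract.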

    An Integrative Method for Accurate Comparative Genome Mapping

    We present MAGIC, an integrative and accurate method for comparative genome mapping. Our method consists of two phases: preprocessing for identifying “maximal similar segments,” and mapping for clustering and classifying these segments. MAGIC's main novelty lies in its biologically intuitive clustering approach, which aims at both calculating reorder-free segments and identifying orthologous segments. In the process, MAGIC efficiently handles ambiguities resulting from duplications that occurred before the speciation of the considered organisms from their most recent common ancestor. We demonstrate both MAGIC's robustness and its scalability: the former is asserted with respect to its initial input and its parameter values, the latter by applying MAGIC to distantly related organisms and to large genomes. We compare MAGIC to other comparative mapping methods and provide a detailed analysis of the differences between them. Our improvements allow a comprehensive study of the diversity of genetic repertoires resulting from large-scale mutations, such as indels and duplications, including explicitly transposable and phagic elements. The strength of our method is demonstrated by detailed statistics computed for each type of these large-scale mutations. MAGIC enabled us to conduct a comprehensive analysis of the different forces shaping prokaryotic genomes from different clades, and to quantify the importance of novel gene content introduced by horizontal gene transfer relative to gene duplication in bacterial genome evolution. We use these results to investigate the breakpoint distribution in several prokaryotic genomes.

    A Hypothetical Example Demonstrating the Definition of Positional Orthologs and the Emergence of a Nuisance Cross-Overlap

    <p>A portion of the genomic segments in a hypothetical cenancestor is denoted by letters. Their orthologous segments in the descendant organisms org1 and org2 are given using the same letters but in a different font (to stress that the segments, despite being orthologous, are similar but not identical). The scenario described in this example is as follows: a duplication of a genomic segment results in two duplicates <i>b</i><sub>1</sub> and <i>b</i><sub>2</sub> in the cenancestor. During the speciation of org1 and org2 the cenancestor genomic segments are shuffled. The orthologous segments <b>b1</b> and b1 have similar genomic contexts and are thus positional orthologs. Similarly, <b>b2</b> and b2 are positional orthologs as well. When comparatively mapping org1 and org2, one would find that <b>b1</b> is similar to b2 and <b>b2</b> is similar to b1. These hits obscure the deduction of the true evolutionary relation between <b>b1</b> and b1 as well as between <b>b2</b> and b2, and are referred to as nuisance cross-overlaps. In real biological examples, similar situations arise, e.g., because of rDNAs; see <a href="http://www.ploscompbiol.org/article/info:doi/10.1371/journal.pcbi.0020075#pcbi-0020075-g002" target="_blank">Figure 2</a>. Notice also that, unlike in sequence alignment, and as is demonstrated in this example, duplications that occurred <i>before</i> the cenancestor (referred to sometimes as <i>outparalogs</i> [<a href="http://www.ploscompbiol.org/article/info:doi/10.1371/journal.pcbi.0020075#pcbi-0020075-b003" target="_blank">3</a>]) may cause difficulties when comparatively mapping two organisms. Thus, nuisance cross-overlaps can be thought of as an “ancestral curse.”</p>

    Joining Gaps into Unmatched Regions and Extracting Long Ones from a Sequence Alignment (Step 3 of the Preprocessing Phase)

    <p>Gaps <i>ℓ</i><sub>1</sub> and <i>ℓ</i><sub>3</sub> in org1 are joined if the (intra-gapped) hit <i>ℓ</i><sub>2</sub> is short enough. Assume that <i>ℓ</i><sub>2</sub> and other intra-gapped regions are short enough so that the proximal gaps in org1 are joined to form the unmatched region <i>ℓ</i>. Assume that gaps in org2 are joined similarly to form the unmatched region <i>k</i>, and that the unmatched regions <i>ℓ</i> and <i>k</i> are long enough. If the regions <i>ℓ</i> and <i>k</i> intersect (as in this example), they are joined and the resulting segments in both organisms are extracted to be handled in the next step.</p>
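
    The joining and extraction described in this caption can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not MAGIC's implementation: gaps are taken as half-open `(start, end)` intervals in shared alignment-column coordinates, "short enough" is read as at most `gap_join_len`, "long enough" as at least `gap_extract_len`, and a region's length is measured as its full span. The function names are hypothetical; the two threshold names echo the <i>gapJoinLen</i> and <i>gapExtractLen</i> parameters mentioned in the listing.

```python
# Hypothetical sketch of gap joining and extraction; the coordinate convention
# and the inclusive thresholds are assumptions, not taken from the paper.

def join_gaps(gaps, gap_join_len):
    """Merge neighbouring gaps when the aligned stretch between them is short."""
    regions = []
    for start, end in sorted(gaps):
        if regions and start - regions[-1][1] <= gap_join_len:
            # The intra-gapped hit is short enough: absorb this gap.
            regions[-1] = (regions[-1][0], max(regions[-1][1], end))
        else:
            regions.append((start, end))
    return regions


def extract_unmatched(gaps_org1, gaps_org2, gap_join_len, gap_extract_len):
    """Return joined spans where long unmatched regions of both organisms intersect."""
    def long_regions(gaps):
        return [(s, e) for s, e in join_gaps(gaps, gap_join_len)
                if e - s >= gap_extract_len]

    extracted = []
    for s1, e1 in long_regions(gaps_org1):
        for s2, e2 in long_regions(gaps_org2):
            if s1 < e2 and s2 < e1:  # the two regions intersect
                extracted.append((min(s1, s2), max(e1, e2)))
    return extracted


result = extract_unmatched(
    gaps_org1=[(100, 180), (190, 300)],
    gaps_org2=[(150, 260), (270, 400)],
    gap_join_len=110,
    gap_extract_len=200,
)
print(result)
```

    With these illustrative inputs, the two gaps in each organism are joined into one unmatched region, both regions pass the length threshold, and their intersection yields a single extracted span.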

    Number of Detected Unmatched Regions in Step 3 of the Preprocessing Phase (see Figure 3A and Preprocessing Phase: Building a Comprehensive Table of Similar Segments) in the Comparison of S. flexneri 2457t and S. typhi ty2 as a Function of <i>gapJoinLen</i> and of <i>gapExtractLen</i>

    <div><p>(A) A 3-D graph of the function.</p><p>(B) A projection of the graph as a function of <i>gapExtractLen</i> (horizontal axis) when setting <i>gapJoinLen</i> = 110.</p><p>(C) A projection of the graph as a function of <i>gapJoinLen</i> (horizontal axis) when setting <i>gapExtractLen</i> = 200.</p></div>