19 research outputs found

Assessing the Quality of Whole Genome Alignments in Bacteria

Author: Ron Shamir
Firas Swidan
Publication venue: Hindawi Publishing Corporation
Publication date: 01/01/2009

Comparing genomes is an essential preliminary step in solving many problems in biology. Matching long similar segments between two genomes is a precondition for their evolutionary, genetic, and genome rearrangement analyses. Though various comparison methods have been developed in recent years, a quantitative assessment of their performance is lacking. Here, we describe two families of assessment measures whose purpose is to evaluate bacteria-oriented comparison tools. The first measure is based on how well the genome segmentation fits the gene annotation of the studied organisms; the second uses the number of segments created by the segmentation and the percentage of the two genomes that are conserved. The effectiveness of the two measures is demonstrated by applying them to the results of genome comparison tools obtained on 41 pairs of bacterial species. Despite the difference in the nature of the two types of measurements, both show consistent results, providing insights into the subtle differences between the mapping tools.
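The two raw quantities behind the second measure (segment count and percentage of each genome conserved) can be illustrated with a minimal sketch. The abstract does not say how the authors combine these quantities, so the code below only computes them. The function names, the half-open interval representation, and the example coordinates are assumptions made for illustration, not the paper's implementation.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # half-open [start, end) coordinates (assumed convention)


def covered_length(intervals: List[Interval]) -> int:
    """Total number of positions covered by the intervals, counting overlaps once."""
    total = 0
    current_end = None
    for start, end in sorted(intervals):
        if current_end is None or start > current_end:
            total += end - start
            current_end = end
        elif end > current_end:
            total += end - current_end
            current_end = end
    return total


def segmentation_summary(
    segments: List[Tuple[Interval, Interval]],
    genome_len_1: int,
    genome_len_2: int,
) -> Tuple[int, float, float]:
    """Return (segment count, % of genome 1 conserved, % of genome 2 conserved).

    Each segment pairs an interval on genome 1 with an interval on genome 2.
    """
    cov1 = covered_length([seg[0] for seg in segments])
    cov2 = covered_length([seg[1] for seg in segments])
    return (
        len(segments),
        100.0 * cov1 / genome_len_1,
        100.0 * cov2 / genome_len_2,
    )


# Hypothetical toy input: two matched segments on genomes of length 1000 and 500.
example = [((0, 100), (50, 150)), ((200, 300), (0, 50))]
summary = segmentation_summary(example, 1000, 500)
```

Under these assumptions, a tool that reports fewer segments at a comparable conserved percentage produces a less fragmented map. How the two quantities are weighed against each other is left to the paper.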

Crossref

Directory of Open Access Journals

PubMed Central

An Integrative Method for Accurate Comparative Genome Mapping

Author: Eduardo P. C Rocha
Firas Swidan
Michael Shmoish
Pavel Pevzner
Ron Y Pinter
Publication venue: Public Library of Science
Publication date: 01/01/2006

We present MAGIC, an integrative and accurate method for comparative genome mapping. Our method consists of two phases: preprocessing for identifying “maximal similar segments,” and mapping for clustering and classifying these segments. MAGIC's main novelty lies in its biologically intuitive clustering approach, which aims towards both calculating reorder-free segments and identifying orthologous segments. In the process, MAGIC efficiently handles ambiguities resulting from duplications that occurred before the speciation of the considered organisms from their most recent common ancestor. We demonstrate both MAGIC's robustness and scalability: the former is asserted with respect to its initial input and with respect to its parameters' values. The latter is asserted by applying MAGIC to distantly related organisms and to large genomes. We compare MAGIC to other comparative mapping methods and provide detailed analysis of the differences between them. Our improvements allow a comprehensive study of the diversity of genetic repertoires resulting from large-scale mutations, such as indels and duplications, including explicitly transposable and phagic elements. The strength of our method is demonstrated by detailed statistics computed for each type of these large-scale mutations. MAGIC enabled us to conduct a comprehensive analysis of the different forces shaping prokaryotic genomes from different clades, and to quantify the importance of novel gene content introduced by horizontal gene transfer relative to gene duplication in bacterial genome evolution. We use these results to investigate the breakpoint distribution in several prokaryotic genomes.

Public Library of Science (PLOS)

CiteSeerX

Crossref

Directory of Open Access Journals

PubMed Central

A Hypothetical Example Demonstrating the Definition of Positional Orthologs and the Emergence of a Nuisance Cross-Overlap

Author: Eduardo P. C Rocha (15100)
Firas Swidan (18791)
Michael Shmoish (6080)
Ron Y Pinter (18792)

A portion of the genomic segments in a hypothetical cenancestor is denoted by letters. Their orthologous segments in the descendant organisms org1 and org2 are given using the same letters but in a different font (to stress that the segments, despite being orthologous, are similar but not identical). The scenario described in this example is as follows: a duplication of a genomic segment results in two duplicates, b1 and b2, in the cenancestor. During the speciation of org1 and org2, the cenancestor genomic segments are shuffled. The orthologous segments b1 in org1 and b1 in org2 have similar genomic contexts and are thus positional orthologs. Similarly, b2 in org1 and b2 in org2 are positional orthologs. When comparatively mapping org1 and org2, one would also find that b1 in one organism is similar to b2 in the other, and vice versa. These hits obscure the deduction of the true evolutionary relation between the two copies of b1, as well as between the two copies of b2, and are referred to as nuisance cross-overlaps. In real biological examples, similar situations arise, e.g., because of rDNAs; see Figure 2. Notice also that, unlike in sequence alignment, and as is demonstrated in this example, duplications that occurred before the cenancestor (sometimes referred to as outparalogs [3]) may cause hardships when comparatively mapping two organisms. Thus, nuisance cross-overlaps can be thought of as an “ancestral curse.”

FigShare

Histogram of RF Numbers when Comparing S. flexneri 2457t versus S. typhi ty2 under Different Parameter Values

Author: Eduardo P. C Rocha (15100)
Firas Swidan (18791)
Michael Shmoish (6080)
Ron Y Pinter (18792)

See the section “Robustness with respect to parameter values.”

FigShare

Joining Gaps into Unmatched Regions and Extracting Long Ones from a Sequence Alignment (Step 3 of the Preprocessing Phase)

Author: Eduardo P. C Rocha (15100)
Firas Swidan (18791)
Michael Shmoish (6080)
Ron Y Pinter (18792)

Gaps ℓ1 and ℓ3 in org1 are joined if the (intra-gapped) hit ℓ2 is short enough. Assume that ℓ2 and the other intra-gapped regions are short enough that the proximal gaps in org1 are joined to form the unmatched region ℓ. Assume that gaps in org2 are joined similarly to form the unmatched region k, and that the unmatched regions ℓ and k are long enough. If the regions ℓ and k intersect (as in this example), they are joined, and the resulting segments in both organisms are extracted to be handled in the next step.
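The joining and extraction rule in this caption can be sketched in a few lines. This is a minimal illustration under stated assumptions, not MAGIC's implementation: gaps are half-open intervals in a shared alignment coordinate system, `gap_join_len` is taken to be the maximum length of an intra-gapped hit that still allows joining, and `gap_extract_len` is taken to be the minimum length of an unmatched region worth extracting. The names echo the gapJoinLen and gapExtractLen parameters mentioned elsewhere in this listing, but their exact semantics and the inclusive comparisons are guesses.

```python
from typing import List, Optional, Tuple

Interval = Tuple[int, int]  # half-open [start, end) in alignment coordinates (assumed)


def join_gaps(gaps: List[Interval], gap_join_len: int) -> List[Interval]:
    """Join consecutive gaps when the hit between them is at most gap_join_len long."""
    regions: List[Interval] = []
    for start, end in sorted(gaps):
        if regions and start - regions[-1][1] <= gap_join_len:
            # The intra-gapped hit is short enough: extend the current region.
            regions[-1] = (regions[-1][0], max(regions[-1][1], end))
        else:
            regions.append((start, end))
    return regions


def extract_long(regions: List[Interval], gap_extract_len: int) -> List[Interval]:
    """Keep only unmatched regions that are at least gap_extract_len long."""
    return [(s, e) for s, e in regions if e - s >= gap_extract_len]


def join_intersecting(region_a: Interval, region_b: Interval) -> Optional[Interval]:
    """Return the union of two regions if they intersect, otherwise None."""
    if region_a[0] < region_b[1] and region_b[0] < region_a[1]:
        return (min(region_a[0], region_b[0]), max(region_a[1], region_b[1]))
    return None


# Hypothetical gaps in org1: the first two are separated by a 10-column hit.
org1_gaps = [(100, 150), (160, 400), (1000, 1020)]
org1_regions = extract_long(join_gaps(org1_gaps, gap_join_len=110), gap_extract_len=200)
# A hypothetical long unmatched region in org2 that intersects the org1 region.
joined = join_intersecting(org1_regions[0], (350, 600))
```

In this toy input the first two gaps merge into one region of length 300, the short third gap is discarded, and the surviving region is joined with the intersecting org2 region.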

FigShare

Number of Detected Unmatched Regions in Step 3 of the Preprocessing Phase (see Figure 3A and Preprocessing Phase: Building a Comprehensive Table of Similar Segments) in the Comparison of S. flexneri 2457t and S. typhi ty2 as a Function of gapJoinLen and of gapExtractLen

Author: Eduardo P. C Rocha (15100)
Firas Swidan (18791)
Michael Shmoish (6080)
Ron Y Pinter (18792)

(A) A 3-D graph of the function. (B) A projection of the graph as a function of gapExtractLen (horizontal axis) when setting gapJoinLen = 110. (C) A projection of the graph as a function of gapJoinLen (horizontal axis) when setting gapExtractLen = 200.

FigShare