Large-Scale Trends in the Evolution of Gene Structures within 11 Animal Genomes

By Mark Yandell, Chris J Mungall, Chris Smith, Simon Prochnik, Joshua Kaminker, George Hartzell, Suzanna Lewis and Gerald M Rubin


We have used the annotations of six animal genomes (Homo sapiens, Mus musculus, Ciona intestinalis, Drosophila melanogaster, Anopheles gambiae, and Caenorhabditis elegans) together with the sequences of five unannotated Drosophila genomes to survey changes in protein sequence and gene structure over a variety of timescales, from the less than 5 million years since the divergence of D. simulans and D. melanogaster to the more than 500 million years that have elapsed since the Cambrian explosion. To do so, we have developed a new open-source software library called CGL (for “Comparative Genomics Library”). Our results demonstrate that change in intron–exon structure is gradual, clock-like, and largely independent of coding-sequence evolution. This means that genome annotations can be used in new ways to inform, corroborate, and test conclusions drawn from comparative genomics analyses that are based upon protein and nucleotide sequence similarities.

Topics: Research Article
Publisher: Public Library of Science
OAI identifier: oai:pubmedcentral.nih.gov:1386723
Provided by: PubMed Central