79 research outputs found

Strong functional patterns in the evolution of eukaryotic genomes revealed by the reconstruction of ancestral protein domain repertoires

Author: Godzik Adam
Zmasek Christian M
Publication venue: BioMed Central
Publication date: 01/01/2011

Background: Genome size and complexity, as measured by the number of genes or protein domains, is remarkably similar in most extant eukaryotes and generally exhibits no correlation with their morphological complexity. Underlying trends in the evolution of the functional content and capabilities of different eukaryotic genomes might be hidden by simultaneous gains and losses of genes. Results: We reconstructed the domain repertoires of putative ancestral species at major divergence points, including the last eukaryotic common ancestor (LECA). We show that, surprisingly, during eukaryotic evolution domain losses in general outnumber domain gains. Only at the base of the animal and the vertebrate sub-trees do domain gains outnumber domain losses. The observed gain/loss balance has a distinct functional bias, most strikingly seen during animal evolution, where most of the gains represent domains involved in regulation and most of the losses represent domains with metabolic functions. This trend is so consistent that clustering of genomes according to their functional profiles results in an organization similar to the tree of life. Furthermore, our results indicate that metabolic functions lost during animal evolution are likely being replaced by the metabolic capabilities of symbiotic organisms such as gut microbes. Conclusions: While protein domain gains and losses are common throughout eukaryote evolution, losses oftentimes outweigh gains and lead to significant differences in functional profiles. Results presented here provide additional arguments for a complex last eukaryotic common ancestor, but also show a general trend of losses in metabolic capabilities and gain in regulatory complexity during the rise of animals.

Crossref

Springer - Publisher Connector

PubMed Central

eScholarship - University of California

Evolution of the protein domain repertoire of eukaryotes reveals strong functional patterns

Author: Godzik Adam
Zmasek Christian M
Publication venue: BioMed Central
Publication date: 01/01/2010

Crossref

PubMed Central

RIO: Analyzing proteomes by automated phylogenomics using resampled inference of orthologs

Author: Eddy Sean R
Zmasek Christian M
Publication venue: BioMed Central
Publication date: 01/01/2002

BACKGROUND: When analyzing protein sequences using sequence similarity searches, orthologous sequences (that diverged by speciation) are more reliable predictors of a new protein's function than paralogous sequences (that diverged by gene duplication). The utility of phylogenetic information in high-throughput genome annotation ("phylogenomics") is widely recognized, but existing approaches are either manual or not explicitly based on phylogenetic trees. RESULTS: Here we present RIO (Resampled Inference of Orthologs), a procedure for automated phylogenomics using explicit phylogenetic inference. RIO analyses are performed over bootstrap resampled phylogenetic trees to estimate the reliability of orthology assignments. We also introduce supplementary concepts that are helpful for functional inference. RIO has been implemented as a Perl pipeline connecting several C and Java programs. It is available at http://www.genetics.wustl.edu/eddy/forester/. A web server is at http://www.rio.wustl.edu/. RIO was tested on the Arabidopsis thaliana and Caenorhabditis elegans proteomes. CONCLUSION: The RIO procedure is particularly useful for the automated detection of first representatives of novel protein subfamilies. We also describe how some orthologies can be misleading for functional inference.
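The idea of scoring orthology over bootstrap-resampled trees can be illustrated with a minimal Python sketch. This is not the RIO implementation (which the abstract describes as a Perl pipeline around C and Java programs). It assumes each bootstrap gene tree already has its internal nodes labeled as speciation ("S") or duplication ("D"); how those labels are inferred is outside the sketch. The tree encoding, the function names, and the toy trees are all hypothetical. Two sequences count as orthologs in a tree when their last common ancestor is a speciation node, and the support value is the fraction of bootstrap trees in which that holds.

```python
# A leaf is a sequence name (str); an internal node is a tuple
# (event, left, right) with event "S" (speciation) or "D" (duplication).
# This encoding is an assumption made for illustration only.

def leaves(tree):
    """Return the set of sequence names below a node."""
    if isinstance(tree, str):
        return {tree}
    _, left, right = tree
    return leaves(left) | leaves(right)


def lca_event(tree, a, b):
    """Return the event label at the last common ancestor of a and b,
    or None if the tree does not contain both sequences."""
    if isinstance(tree, str):
        return None
    event, left, right = tree
    for child in (left, right):
        names = leaves(child)
        if a in names and b in names:
            # Both sequences sit in the same subtree: the LCA is deeper.
            return lca_event(child, a, b)
    if {a, b} <= leaves(tree):
        return event
    return None


def orthology_support(trees, a, b):
    """Fraction of bootstrap trees in which a and b are separated by speciation."""
    hits = sum(1 for tree in trees if lca_event(tree, a, b) == "S")
    return hits / len(trees)


# Four hypothetical bootstrap replicates for a query "q".
bootstrap_trees = [
    ("S", "q", "x"),
    ("D", "q", "x"),
    ("S", ("D", "q", "y"), "x"),
    ("S", "q", "x"),
]

support_qx = orthology_support(bootstrap_trees, "q", "x")
support_qy = orthology_support(bootstrap_trees, "q", "y")
```

A low support value flags an orthology assignment that depends on a particular resampling and should be treated with caution when transferring function.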

CiteSeerX

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Surprising complexity of the ancestral apoptosis network

Author: Godzik Adam
Ye Yuzhen
Zhang Qing
Zmasek Christian M
Publication venue: BioMed Central
Publication date: 01/01/2007

A comparative genomics approach revealed that the genes for several components of the apoptosis network with single copies in vertebrates have multiple paralogs in cnidarian-bilaterian ancestors, suggesting a complex evolutionary history for this network.

Crossref

Springer - Publisher Connector

PubMed Central

eScholarship - University of California

phyloXML: XML for evolutionary biology and comparative genomics

Author: Christian M Zmasek
CM Zmasek
DR Maddison
E Antezana
J Felsenstein
J Leebens-Mack
JA Eisen
JC Avise
JE Stajich
Mira V Han
MW Peterson
N Cannata
N Goto
PJ Cock
Q Zhang
R Gilmour
T Bray
Publication venue: BioMed Central
Publication date: 01/01/2009

Background: Evolutionary trees are central to a wide range of biological studies. In many of these studies, tree nodes and branches need to be associated (or annotated) with various attributes. For example, in studies concerned with organismal relationships, tree nodes are associated with taxonomic names, whereas tree branches have lengths and oftentimes support values. Gene trees used in comparative genomics or phylogenomics are usually annotated with taxonomic information, genome-related data, such as gene names and functional annotations, as well as events such as gene duplications, speciations, or exon shufflings, combined with information related to the evolutionary tree itself. The data standards currently used for evolutionary trees have limited capacities to incorporate such annotations of different data types. Results: We developed an XML language, named phyloXML, for describing evolutionary trees, as well as various associated data items. PhyloXML provides elements for commonly used items, such as branch lengths, support values, taxonomic names, and gene names and identifiers. By using "property" elements, phyloXML can be adapted to novel and unforeseen use cases. We also developed various software tools for reading, writing, conversion, and visualization of phyloXML formatted data. Conclusion: PhyloXML is an XML language defined by a complete schema in XSD that allows storing and exchanging the structures of evolutionary trees as well as associated data. More information about phyloXML itself, the XSD schema, as well as tools implementing and supporting phyloXML, is available at http://www.phyloxml.org.
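As a rough illustration of how such a tree might be written and read back, the following Python sketch builds a small phyloXML-style document with the standard library and then recovers the leaf names. It is a sketch only: the element names (phylogeny, clade, name, branch_length, confidence) are used as commonly seen in phyloXML files, the output has not been validated against the XSD schema, and the helper names and the toy tree are hypothetical.

```python
import xml.etree.ElementTree as ET

NS = "http://www.phyloxml.org"
ET.register_namespace("", NS)  # serialize with a default namespace


def tag(local_name):
    """Qualify an element name with the phyloXML namespace."""
    return f"{{{NS}}}{local_name}"


def add_clade(parent, name=None, branch_length=None, bootstrap=None):
    """Append a clade with optional name, branch length and bootstrap support."""
    clade = ET.SubElement(parent, tag("clade"))
    if name is not None:
        ET.SubElement(clade, tag("name")).text = name
    if branch_length is not None:
        ET.SubElement(clade, tag("branch_length")).text = str(branch_length)
    if bootstrap is not None:
        conf = ET.SubElement(clade, tag("confidence"), {"type": "bootstrap"})
        conf.text = str(bootstrap)
    return clade


def leaf_names(root):
    """Names of clades that contain no child clade, in document order."""
    return [
        clade.findtext(tag("name"))
        for clade in root.iter(tag("clade"))
        if clade.find(tag("clade")) is None
    ]


# Toy tree ((A, B), C) with made-up branch lengths and support.
root = ET.Element(tag("phyloxml"))
phylogeny = ET.SubElement(root, tag("phylogeny"), {"rooted": "true"})
top = add_clade(phylogeny)
inner = add_clade(top, branch_length=0.06, bootstrap=89)
add_clade(inner, name="A", branch_length=0.102)
add_clade(inner, name="B", branch_length=0.23)
add_clade(top, name="C", branch_length=0.4)

xml_text = ET.tostring(root, encoding="unicode")
parsed_leaves = leaf_names(ET.fromstring(xml_text))
```

Because annotations live in named child elements rather than in a flat label string, a reader can pick out branch lengths, support values, or names independently of one another.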

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Divergent evolution of protein conformational dynamics in dihydrofolate reductase.

Author: Bhabha Gira
Dyson H Jane
Ekiert Damian C
Godzik Adam
Jennewein Madeleine
Kroon Gerard
Tuttle Lisa M
Wilson Ian A
Wright Peter E
Zmasek Christian M
Publication venue: eScholarship, University of California
Publication date: 01/11/2013

Molecular evolution is driven by mutations, which may affect the fitness of an organism and are then subject to natural selection or genetic drift. Analysis of primary protein sequences and tertiary structures has yielded valuable insights into the evolution of protein function, but little is known about the evolution of functional mechanisms, protein dynamics and conformational plasticity essential for activity. We characterized the atomic-level motions across divergent members of the dihydrofolate reductase (DHFR) family. Despite structural similarity, Escherichia coli and human DHFRs use different dynamic mechanisms to perform the same function, and human DHFR cannot complement DHFR-deficient E. coli cells. Identification of the primary-sequence determinants of flexibility in DHFRs from several species allowed us to propose a likely scenario for the evolution of functionally important DHFR dynamics following a pattern of divergent evolution that is tuned by cellular environment

PubMed Central

eScholarship - University of California

Novel genes dramatically alter regulatory network topology in amphioxus

Author: Dishaw Larry J
Godzik Adam
Litman Gary W
Mueller M Gail
Ye Yuzhen
Zhang Qing
Zmasek Christian M
Publication venue: BioMed Central
Publication date: 01/01/2008

Domain rearrangements in the innate immune network of amphioxus suggest that domain shuffling has shaped the evolution of immune systems.

Crossref

Springer - Publisher Connector

PubMed Central

eScholarship - University of California

GreenPhylDB v2.0: comparative and functional genomics in plants

Author: Alonso
Altschul
Ashburner
Bailey
Bowman
Cannon
Carbon
Christelle Aluome
Christian M. Zmasek
Christian Walde
Christophe Périn
Conte
Craigon
De Bodt
Droc
Enright
Fitch
Flavell
Gabaldon
Gagnot
Gaëtan Droc
Guindon
Han
Hruz
Hulo
Hunter
Jaillon
Kanehisa
Katoh
Kaul
Kuzniar
Lawrence
Liolios
Marie-Angélique Laporte
Mathieu Rouard
Matsuzaki
Matthieu G. Conte
Merchant
Ming
Palenik
Paterson
Pei
Rensing
Salse
Schmutz
Schnable
Schneider
Sequencing ProjectInternational Rice Genome
Swarbreck
Tuskan
Valentin Guignon
Van de Peer
Varshney
Vogel
Waterhouse
Yazaki
Zdobnov
Zmasek
Publication venue: Oxford University Press
Publication date: 01/01/2011

GreenPhylDB is a database designed for comparative and functional genomics based on complete genomes. Version 2 now contains sixteen full genomes of members of the Plantae kingdom, ranging from algae to angiosperms, automatically clustered into gene families. Gene families are manually annotated and then analyzed phylogenetically in order to elucidate orthologous and paralogous relationships. The database offers various lists of gene families including plant-, phylum- and species-specific gene families. For each gene cluster or gene family, easy access to gene composition, protein domains, publications, external links and orthologous gene predictions is provided. Web interfaces have been further developed to improve the navigation through information related to gene families. New analysis tools are also available, such as a gene family ontology browser that facilitates exploration. GreenPhylDB is a component of the South Green Bioinformatics Platform (http://southgreen.cirad.fr/) and is accessible at http://greenphyl.cirad.fr. It enables comparative genomics in a broad taxonomy context to enhance the understanding of evolutionary processes and thus tends to speed up gene discovery.

Crossref

PubMed Central

Agritrop

Phylotastic! Making Tree-of-Life Knowledge Accessible, Reusable and Convenient

Scientists rarely reuse expert knowledge of phylogeny, in spite of years of effort to assemble a great "Tree of Life" (ToL). A notable exception involves the use of Phylomatic, which provides tools to generate custom phylogenies from a large, pre-computed, expert phylogeny of plant taxa. This suggests great potential for a more generalized system that, starting with a query consisting of a list of any known species, would rectify non-standard names, identify expert phylogenies containing the implicated taxa, prune away unneeded parts, and supply branch lengths and annotations, resulting in a custom phylogeny suited to the user's needs. Such a system could become a sustainable community resource if implemented as a distributed system of loosely coupled parts that interact through clearly defined interfaces. Results: With the aim of building such a "phylotastic" system, the NESCent Hackathons, Interoperability, Phylogenies (HIP) working group recruited 2 dozen scientist-programmers to a weeklong programming hackathon in June 2012. During the hackathon (and a three-month follow-up period), 5 teams produced designs, implementations, documentation, presentations, and tests including: (1) a generalized scheme for integrating components; (2) proof-of-concept pruners and controllers; (3) a meta-API for taxonomic name resolution services; (4) a system for storing, finding, and retrieving phylogenies using semantic web technologies for data exchange, storage, and querying; (5) an innovative new service, DateLife.org, which synthesizes pre-computed, time-calibrated phylogenies to assign ages to nodes; and (6) demonstration projects. These outcomes are accessible via a public code repository (GitHub.com), a website (www.phylotastic.org), and a server image. 
Conclusions: Approximately 9 person-months of effort (centered on a software development hackathon) resulted in the design and implementation of proof-of-concept software for 4 core phylotastic components, 3 controllers, and 3 end-user demonstration tools. While these products have substantial limitations, they suggest considerable potential for a distributed system that makes phylogenetic knowledge readily accessible in computable form. Widespread use of phylotastic systems will create an electronic marketplace for sharing phylogenetic knowledge that will spur innovation in other areas of the ToL enterprise, such as annotation of sources and methods and third-party methods of quality assessment.
NESCent (the National Evolutionary Synthesis Center); NSF EF-0905606; iPlant Collaborative (NSF) DBI-0735191; Biodiversity Synthesis Center (BioSync) of the Encyclopedia of Life; Computer Science

Crossref

Springer - Publisher Connector

PubMed Central

DukeSpace

eScholarship - University of California

The University of Arizona

Access to Research at National University of Ireland, Galway

Texas ScholarWorks