
    eggNOG 5.0: a hierarchical, functionally and phylogenetically annotated orthology resource based on 5090 organisms and 2502 viruses

    eggNOG is a public database of orthology relationships, gene evolutionary histories and functional annotations. Here, we present version 5.0, featuring a major update of the underlying genome sets, which have been expanded to 4445 representative bacteria and 168 archaea derived from 25 038 genomes, as well as 477 eukaryotic organisms and 2502 viral proteomes that were selected for diversity and filtered by genome quality. In total, 4.4M orthologous groups (OGs) distributed across 379 taxonomic levels were computed together with their associated sequence alignments, phylogenies, HMM models and functional descriptors. Precomputed evolutionary analysis provides fine-grained resolution of duplication/speciation events within each OG. Our benchmarks show that, despite doubling the amount of genomes, the quality of orthology assignments and functional annotations (80% coverage) has persisted without significant changes across this update. Finally, we improved eggNOG online services for fast functional annotation and orthology prediction of custom genomics or metagenomics datasets. All precomputed data are publicly available for downloading or via API queries at http://eggnog.embl.de

    Advances and Applications in the Quest for Orthologs

    Gene families evolve by the processes of speciation (creating orthologs), gene duplication (paralogs) and horizontal gene transfer (xenologs), in addition to sequence divergence and gene loss. Orthologs in particular play an essential role in comparative genomics and phylogenomic analyses. With the continued sequencing of organisms across the tree of life, the data are available to reconstruct the unique evolutionary histories of tens of thousands of gene families. Accurate reconstruction of these histories, however, is a challenging computational problem, and the focus of the Quest for Orthologs Consortium. We review the recent advances and outstanding challenges in this field, as revealed at a symposium and meeting held at the University of Southern California in 2017. Key advances have been made both at the level of orthology algorithm development and with respect to coordination across the community of algorithm developers and orthology end-users. Applications spanned a broad range, including gene function prediction, phylostratigraphy, genome evolution, and phylogenomics. The meetings highlighted the increasing use of meta-analyses integrating results from multiple different algorithms, and discussed ongoing challenges in orthology inference as well as the next steps toward improvement and integration of orthology resources

    Expanding the Orthologous Matrix (OMA) programmatic interfaces: REST API and the OmaDB packages for R and Python.

    The Orthologous Matrix (OMA) is a well-established resource to identify orthologs among many genomes. Here, we present two recent additions to its programmatic interface, namely a REST API, and user-friendly R and Python packages called OmaDB. These should further facilitate the incorporation of OMA data into computational scripts and pipelines. The REST API can be freely accessed at https://omabrowser.org/api. The R OmaDB package is available as part of Bioconductor at http://bioconductor.org/packages/OmaDB/, and the omadb Python package is available from the Python Package Index (PyPI) at https://pypi.org/project/omadb/

    Gearing up to handle the mosaic nature of life in the quest for orthologs

    The Quest for Orthologs (QfO) is an open collaboration framework for experts in comparative phylogenomics and related research areas who have an interest in highly accurate orthology predictions and their applications. We here report highlights and discussion points from the QfO meeting 2015 held in Barcelona. Achievements in recent years have established a basis to support developments for improved orthology prediction and to explore new approaches. Central to the QfO effort is proper benchmarking of methods and services, as well as design of standardized datasets and standardized formats to allow sharing and comparison of results. Simultaneously, analysis pipelines have been improved, evaluated, and adapted to handle large datasets. All this would not have occurred without the long-term collaboration of Consortium members. Meeting regularly to review and coordinate complementary activities from a broad spectrum of innovative researchers clearly benefits the community. Highlights of the meeting include addressing sources of and legitimacy of disagreements between orthology calls, the context dependency of orthology definitions, special challenges encountered when analyzing very anciently rooted orthologies, orthology in the light of whole-genome duplications, and the concept of orthologous versus paralogous relationships at different levels, including domain-level orthology. Furthermore, particular needs for different applications (e.g. plant genomics, ancient gene families, and others) and the infrastructure for making orthology inferences available (e.g. interfaces with model organism databases) were discussed, with several ongoing efforts that are expected to be reported on during the upcoming 2017 QfO meeting. This work was supported by Spanish Ministry of Economy and Competitiveness grant BIO2012-37161 (to T.G.), Qatar National Research Fund NPRP 5-298-3-086 (to T.G.), European Research Council grant ERC-2012-StG-310325 (to T.G.), National Institutes of Health (NIH) grant R24 OD011883 (to S.E.L.), U41 HG002273 (to S.E.L. and P.D.T.), Swiss National Science Foundation grant PP00P3_150654 (to C.D.), UK Biotechnology and Biological Sciences Research Council grant BB/M015009/1 (to C.D.), Swiss State Secretariat for Education, Research and Innovation (SERI) funding (to B.B. and C.D.). M.M. and M.P. acknowledge support from the Wellcome Trust (grant number WT108749/Z/15/Z) and the European Molecular Biology Laboratory. The research leading to these results has received funding from the European Union's Seventh Framework Programme (FP7/2007-2013) under grant agreement n° 222664 (Quantomics)

    The study of plant genome evolution by means of phylogenomics


    Xenacoelomorpha: The "simple" key to bilaterian ancestry?

    Xenacoelomorpha (comprising Xenoturbellida, Acoela and Nemertodermatida) is a clade of marine worms whose position in the tree of life is still in debate. Several phylogenetic analyses have shown them to be placed at the base of all bilaterian animals (e.g. chordates, arthropods) or at a more derived position as sister group to the Ambulacraria (echinoderms and hemichordates) within the Bilateria. A key characteristic is the absence of traits found in other bilaterian animals. Orthogroups are groups of orthologous genes found in several organisms. Orthologues are assumed to retain the same function. These functions would be specific to the clade where an orthogroup is prevalent. I investigate a method to automatically establish and validate orthogroups specific to Bilateria, Protostomia and Deuterostomia. These genes could be relevant for the clades’ respective emergence and differences. These sets will also help to ascertain what genes/functions are absent from Xenacoelomorpha. MicroRNAs (miRNAs) are small non-coding RNA molecules involved in RNA silencing and post-transcriptional regulation of gene expression. MiRNAs have not been extensively studied in the Xenacoelomorpha. I introduce a fully automatic miRNA detection pipeline to infer and confirm the existence of pre-miRNA sequences in the genome of Xenoturbella bocki as well as predict miRNA candidates from several xenacoel genomes. I report previously undetected miRNA families and opine that previous analyses on Acoelomorpha failed due to loss caused by the higher evolutionary rate when compared to the Xenoturbellida