    The Hymenoptera Genome Database

    The Hymenoptera Genome Database (HGD) is an informatics resource supporting genomics of hymenopteran insect species. This relational database implements open-source software and components providing access to curated data contributed by an extensive, active research community. HGD includes the genome sequences and annotation data of the honey bee _Apis mellifera_ and its pathogens (http://BeeBase.org), the parasitoid wasp _Nasonia vitripennis_ (http://NasoniaBase.org), and a portal to the genomes of six species of ants. Together, these species cover approximately 200 MY in the phylogeny of Hymenoptera, allowing researchers to leverage genetic, genome sequence, and gene expression data, as well as the biological knowledge of related model organisms. The availability of resources across an order greatly facilitates comparative genomics and enhances our understanding of the biology of agriculturally important Hymenoptera species through genomics. HGD has supported research contributions from an extensive community from almost 80 institutions in 14 countries. Community annotation efforts are made possible by a remote connection to a Chado database through the Apollo Genome Annotation client software. Curated data at HGD include predicted and annotated gene sets supported with evidence tracks such as ESTs/cDNAs, small RNA sequences, and GC composition domains. Data at HGD can be queried using genome browsers and/or BLAST/PSI-BLAST servers, and may also be downloaded to perform local searches. We encourage the public to access and contribute data to HGD at http://HymenopteraGenome.org.

This poster contains material included in an article accepted for publication in Nucl. Acids Res. © 2011, The Database Issue. Published by Oxford University Press.

    A quick guide for student-driven community genome annotation

    High-quality gene models are necessary to expand the molecular and genetic tools available for a target organism, but these are available for only a handful of model organisms that have undergone extensive curation and experimental validation over the course of many years. The majority of gene models present in biological databases today have been identified in draft genome assemblies using automated annotation pipelines that are frequently based on orthologs from distantly related model organisms. Manual curation is time-consuming and often requires substantial expertise, but it is instrumental in improving gene model structure and identification. Manual annotation may seem to be a daunting and cost-prohibitive task for small research communities, but involving undergraduates in community genome annotation consortiums can benefit both education and genomic resources. We outline a workflow for efficient manual annotation driven by a team of primarily undergraduate annotators. This model can be scaled to large teams and includes quality control processes through incremental evaluation. Moreover, it gives students an opportunity to increase their understanding of genome biology and to participate in scientific research in collaboration with peers and senior researchers at multiple institutions.

    BOSC 2022: the first hybrid and 23rd annual Bioinformatics Open Source Conference

    The 23rd annual Bioinformatics Open Source Conference (BOSC 2022) was part of this year's conference on Intelligent Systems for Molecular Biology (ISMB). Launched in 2000 and held every year since, BOSC is the premier meeting covering open source bioinformatics and open science. ISMB 2022 was, for the first time, a hybrid conference, with the in-person component hosted in Madison, Wisconsin (USA). About 1000 people attended ISMB 2022 in person, with another 800 online. Approximately 200 people participated in BOSC sessions, which included 28 talks chosen from submitted abstracts, 46 posters, and a panel discussion, "Building and Sustaining Inclusive Open Science Communities". BOSC 2022 included joint keynotes with two other COSIs. Jason Williams gave a BOSC / Education COSI keynote entitled "Riding the bicycle: Including all scientists on a path to excellence". A joint session with Bio-Ontologies featured a keynote by Melissa Haendel, "The open data highway: turbo-boosting translational traffic with ontologies".

    Representing glycophenotypes: semantic unification of glycobiology resources for disease discovery.

    While abnormalities related to carbohydrates (glycans) are frequent in patients with rare and undiagnosed diseases as well as in many common diseases, these glycan-related phenotypes (glycophenotypes) are not well represented in knowledge bases (KBs). If glycan-related diseases were more robustly represented and curated with glycophenotypes, these could be used for molecular phenotyping to help realize the goals of precision medicine. Diagnosis of rare diseases by computational cross-species comparison of genotype-phenotype data has been facilitated by leveraging ontological representations of clinical phenotypes, using the Human Phenotype Ontology (HPO) and model organism ontologies such as the Mammalian Phenotype Ontology (MP), in the context of the Monarch Initiative. In this article, we discuss the importance and complexity of glycobiology and review the structure of glycan-related content from existing KBs and biological ontologies. We show how semantically structuring knowledge about the annotation of glycophenotypes could enhance disease diagnosis, and propose a solution to integrate glycophenotypes and related diseases into the Unified Phenotype Ontology (uPheno), HPO, Monarch, and other KBs. We encourage the community to practice good identifier hygiene for glycans in support of semantic analysis, and clinicians to add glycomics to their diagnostic analyses of rare diseases.

    KG-COVID-19: A Framework to Produce Customized Knowledge Graphs for COVID-19 Response.

    Integrated, up-to-date data about SARS-CoV-2 and COVID-19 are crucial for the ongoing response to the COVID-19 pandemic by the biomedical research community. While rich biological knowledge exists for SARS-CoV-2 and related viruses (SARS-CoV, MERS-CoV), integrating this knowledge is difficult and time-consuming, since much of it is in siloed databases or in textual format. Furthermore, the data required by the research community vary drastically for different tasks; the optimal data for a machine learning task, for example, are very different from the data used to populate a browsable user interface for clinicians. To address these challenges, we created KG-COVID-19, a flexible framework that ingests and integrates heterogeneous biomedical data to produce knowledge graphs (KGs), and applied it to create a KG for COVID-19 response. This KG framework can also be applied to other problems in which siloed biomedical data must be quickly integrated for different research applications, including future pandemics.

    Gene content evolution in the arthropods

    Arthropods comprise the largest and most diverse phylum on Earth and play vital roles in nearly every ecosystem. Their diversity stems in part from variations on a conserved body plan, resulting from and recorded in adaptive changes in the genome. Dissection of the genomic record of sequence change enables broad questions regarding genome evolution to be addressed, even across hyper-diverse taxa within arthropods. Using 76 whole genome sequences representing 21 orders spanning more than 500 million years of arthropod evolution, we document changes in gene and protein domain content and provide temporal and phylogenetic context for interpreting these innovations. We identify many novel gene families that arose early in the evolution of arthropods and during the diversification of insects into modern orders. We reveal unexpected variation in patterns of DNA methylation across arthropods and examples of gene family and protein domain evolution coincident with the appearance of notable phenotypic and physiological adaptations such as flight, metamorphosis, sociality, and chemoperception. These analyses demonstrate how large-scale comparative genomics can provide broad new insights into the genotype-to-phenotype map and generate testable hypotheses about the evolution of animal diversity.