
    The cellular microscopy phenotype ontology

    BACKGROUND: Phenotypic data derived from high content screening is currently annotated using free text, thus preventing the integration of independent datasets, including those generated in different biological domains, such as cell lines, mouse and human tissues. DESCRIPTION: We present the Cellular Microscopy Phenotype Ontology (CMPO), a species-neutral ontology for describing phenotypic observations relating to the whole cell, cellular components, cellular processes and cell populations. CMPO is compatible with related ontology efforts, allowing for future cross-species integration of phenotypic data. CMPO was developed following a curator-driven approach where phenotype data were annotated by expert biologists following the Entity-Quality (EQ) pattern. These EQs were subsequently transformed into new CMPO terms following an established post-composition process. CONCLUSION: CMPO is currently being utilized to annotate phenotypes associated with high content screening datasets stored in several image repositories, including the Image Data Repository (IDR), the MitoSys project database and the Cellular Phenotype Database, to facilitate data browsing and discoverability.
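The Entity-Quality pattern mentioned in this abstract pairs the thing observed (the entity) with how it differs (the quality). The sketch below is only an illustration of that idea, not CMPO's actual tooling: the class `EQAnnotation`, the function `compose_label`, the label format and the example values are all hypothetical, and no real ontology identifiers are used.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EQAnnotation:
    """One curator annotation: an entity plus a quality (hypothetical structure)."""
    entity: str   # e.g. a cellular component or process
    quality: str  # e.g. how the entity deviates from normal


def compose_label(eq: EQAnnotation) -> str:
    """Turn an EQ pair into a single readable phenotype label.

    The label format is an assumption made for illustration only.
    """
    return f"{eq.quality} {eq.entity} phenotype"


# Hypothetical example values, not taken from CMPO.
annotation = EQAnnotation(entity="nucleus", quality="enlarged")
print(compose_label(annotation))
```

Keeping the entity and the quality as separate fields is what lets annotations from different datasets be compared on either axis before they are collapsed into a single term.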

    WormBase: a comprehensive resource for nematode research

    WormBase (http://www.wormbase.org) is a central data repository for nematode biology. Initially created as a service to the Caenorhabditis elegans research field, WormBase has evolved into a powerful research tool in its own right. In the past 2 years, we expanded WormBase to include the complete genomic sequence, gene predictions and orthology assignments from a range of related nematodes. These comparative data enrich the C. elegans data with improved gene predictions and a better understanding of gene function. In turn, they bring the wealth of experimental knowledge of C. elegans to other systems of medical and agricultural importance. Here, we describe new species and data types now available at WormBase. In addition, we detail enhancements to our curatorial pipeline and website infrastructure to accommodate new genomes and an extensive user base.

    A systematic analysis of host factors reveals a Med23-interferon-λ regulatory axis against herpes simplex virus type 1 replication

    Herpes simplex virus type 1 (HSV-1) is a neurotropic virus causing vesicular oral or genital skin lesions, meningitis and other diseases particularly harmful in immunocompromised individuals. To comprehensively investigate the complex interaction between HSV-1 and its host we combined two genome-scale screens for host factors (HFs) involved in virus replication. A yeast two-hybrid screen for protein interactions and an RNA interference (RNAi) screen with a druggable genome small interfering RNA (siRNA) library confirmed existing and identified novel HFs that functionally influence HSV-1 infection. Bioinformatic analyses found that the 358 HFs were enriched for several pathways and multi-protein complexes. Of particular interest was the identification of Med23 as a strongly anti-viral component of the largely pro-viral Mediator complex, which links specific transcription factors to RNA polymerase II. The anti-viral effect of Med23 on HSV-1 replication was confirmed in gain-of-function gene overexpression experiments, and this inhibitory effect was specific to HSV-1, as a range of other viruses including Vaccinia virus and Semliki Forest virus were unaffected by Med23 depletion. We found that Med23 significantly upregulated expression of the type III interferon family (IFN-λ) at the mRNA and protein level by directly interacting with the transcription factor IRF7. The synergistic effect of Med23 and IRF7 on IFN-λ induction suggests this is the major transcription factor for IFN-λ expression. Genotypic analysis of patients suffering recurrent orofacial HSV-1 outbreaks, previously shown to be deficient in IFN-λ secretion, found a significant correlation with a single nucleotide polymorphism in the IFN-λ3 (IL28b) promoter strongly linked to Hepatitis C disease and treatment outcome. This paper describes a link between Med23 and IFN-λ, provides evidence for the crucial role of IFN-λ in HSV-1 immune control, and highlights the power of integrative genome-scale approaches to identify HFs critical for disease progression and outcome.

    Semi-automated curation of protein subcellular localization: a text mining-based approach to Gene Ontology (GO) Cellular Component curation

    Background: Manual curation of experimental data from the biomedical literature is an expensive and time-consuming endeavor. Nevertheless, most biological knowledge bases still rely heavily on manual curation for data extraction and entry. Text mining software that can semi- or fully automate information retrieval from the literature would thus provide a significant boost to manual curation efforts. Results: We employ the Textpresso category-based information retrieval and extraction system (http://www.textpresso.org), developed by WormBase, to explore how Textpresso might improve the efficiency with which we manually curate C. elegans proteins to the Gene Ontology's Cellular Component Ontology. Using a training set of sentences that describe results of localization experiments in the published literature, we generated three new curation task-specific categories (Cellular Components, Assay Terms, and Verbs) containing words and phrases associated with reports of experimentally determined subcellular localization. We compared the results of manual curation to those of Textpresso queries that searched the full text of articles for sentences containing terms from each of the three new categories plus the name of a previously uncurated C. elegans protein, and found that Textpresso searches identified curatable papers with recall and precision rates of 79.1% and 61.8%, respectively (F-score of 69.5%), when compared to manual curation. Within those documents, Textpresso identified relevant sentences with recall and precision rates of 30.3% and 80.1% (F-score of 44.0%). From returned sentences, curators were able to make 66.2% of all possible experimentally supported GO Cellular Component annotations with 97.3% precision (F-score of 78.8%). Measuring the relative efficiencies of Textpresso-based versus manual curation, we find that Textpresso has the potential to increase curation efficiency by at least 8-fold, and perhaps as much as 15-fold, given differences in individual curatorial speed. Conclusion: Textpresso is an effective tool for improving the efficiency of manual, experimentally based curation. Incorporating a Textpresso-based Cellular Component curation pipeline at WormBase has allowed us to transition from strictly manual curation of this data type to a more efficient pipeline of computer-assisted validation. Continued development of curation task-specific Textpresso categories will provide an invaluable resource for genomics databases that rely heavily on manual curation.
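The F-scores quoted in this abstract can be related to the recall and precision figures with a short calculation. The sketch below assumes the F-score is the balanced harmonic mean of the two rates (the abstract does not state the formula), and the function name `f_score` and the task labels are chosen here for illustration. Because the inputs are already rounded, the recomputed values agree with the reported ones only to within about a tenth of a percentage point.

```python
def f_score(recall_pct: float, precision_pct: float) -> float:
    """Balanced F-score (harmonic mean of recall and precision), in percent.

    Assumes the balanced (F1) form; the abstract does not give the formula.
    """
    total = recall_pct + precision_pct
    if total == 0:
        return 0.0
    return 2 * recall_pct * precision_pct / total


# (recall %, precision %, F-score % as reported in the abstract)
reported = {
    "curatable papers": (79.1, 61.8, 69.5),
    "relevant sentences": (30.3, 80.1, 44.0),
    "GO Cellular Component annotations": (66.2, 97.3, 78.8),
}

for task, (recall, precision, stated) in reported.items():
    computed = f_score(recall, precision)
    print(f"{task}: computed {computed:.1f}%, reported {stated}%")
```

The harmonic mean penalizes imbalance, which is why the sentence-level score stays low despite 80.1% precision: recall of 30.3% dominates the result.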