47 research outputs found

    The what, where, how and why of gene ontology—a primer for bioinformaticians

    With high-throughput technologies providing vast amounts of data, it has become more important to provide systematic, quality annotations. The Gene Ontology (GO) project is the largest resource for cataloguing gene function. Nonetheless, its use is not yet ubiquitous and is still fraught with pitfalls. In this review, we provide a short primer to the GO for bioinformaticians. We summarize important aspects of the structure of the ontology, describe sources and types of functional annotations, survey measures of GO annotation similarity, review typical uses of GO and discuss other important considerations pertaining to the use of GO in bioinformatics applications

    Investigation of cancerous neoplasms using bioinformatics methods in genome sequencing experiments

    Chepeleva M.K., Nazarov P.V. Research of cancerous neoplasms by methods Bioinformatics in genomic sequencing experiments. Section 5. Processing of signals, images and vide

    DeepBrain: Functional Representation of Neural In-Situ Hybridization Images for Gene Ontology Classification Using Deep Convolutional Autoencoders

    This paper presents a novel deep learning-based method for learning a functional representation of mammalian neural images. The method uses a deep convolutional denoising autoencoder (CDAE) for generating an invariant, compact representation of in situ hybridization (ISH) images. While most existing methods for bio-imaging analysis were not developed to handle images with highly complex anatomical structures, the results presented in this paper show that functional representation extracted by CDAE can help learn features of functional gene ontology categories for their classification in a highly accurate manner. Using this CDAE representation, our method outperforms the previous state-of-the-art classification rate, by improving the average AUC from 0.92 to 0.98, i.e., achieving 75% reduction in error. The method operates on input images that were downsampled significantly with respect to the original ones to make it computationally feasible
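    The "75% reduction in error" figure can be checked with a short calculation. The sketch below assumes that "error" means 1 − AUC, which the abstract implies but does not define; the function name is illustrative only.

```python
def relative_error_reduction(auc_before: float, auc_after: float) -> float:
    """Fractional drop in (1 - AUC) when moving from auc_before to auc_after."""
    error_before = 1.0 - auc_before  # 0.08 for the previous state of the art
    error_after = 1.0 - auc_after    # 0.02 for the CDAE representation
    return (error_before - error_after) / error_before


# AUC values quoted in the abstract
reduction = relative_error_reduction(0.92, 0.98)
print(f"{reduction:.0%}")
```

    Under that assumption the error falls from 0.08 to 0.02, which is three quarters of the original error.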

    CIDO, a community-based ontology for coronavirus disease knowledge and data integration, sharing, and analysis

    Ontologies, as the term is used in informatics, are structured vocabularies composed of human- and computer-interpretable terms and relations that represent entities and relationships. Within informatics fields, ontologies play an important role in knowledge and data standardization, representation, integration, sharing and analysis. They have also become a foundation of artificial intelligence (AI) research. In what follows, we outline the Coronavirus Infectious Disease Ontology (CIDO), which covers multiple areas in the domain of coronavirus diseases, including etiology, transmission, epidemiology, pathogenesis, diagnosis, prevention, and treatment. We emphasize CIDO development relevant to COVID-19

    Ortholog identification in the presence of domain architecture rearrangement

    Ortholog identification is used in gene functional annotation, species phylogeny estimation, phylogenetic profile construction and many other analyses. Bioinformatics methods for ortholog identification are commonly based on pairwise protein sequence comparisons between whole genomes. Phylogenetic methods of ortholog identification have also been developed; these methods can be applied to protein data sets sharing a common domain architecture or which share a single functional domain but differ outside this region of homology. While promiscuous domains represent a challenge to all orthology prediction methods, overall structural similarity is highly correlated with proximity in a phylogenetic tree, conferring a degree of robustness to phylogenetic methods. In this article, we review the issues involved in orthology prediction when data sets include sequences with structurally heterogeneous domain architectures, with particular attention to automated methods designed for high-throughput application, and present a case study to illustrate the challenges in this area

    piRNA expression in regenerative tissue of Octopus bimaculoides

    Tissue regeneration is present in varying capacities across the animal kingdom. Animals such as Hydra and planarians have the capacity to regenerate entire bodies from extremely small sections of amputated tissue. Others, such as humans, have restricted capacities of regeneration, especially in terms of full appendages and specialized tissues such as cardiac and nervous tissue. One of the primary goals of studying regeneration in other organisms is to achieve the development of regenerative medicine. Interactions between P-element induced WImpy testis (PIWI) proteins and PIWI-interacting RNAs (piRNAs) have been implicated in germline genome maintenance, as well as transposable element silencing. Research has also connected PIWI protein expression to regeneration in model organisms. Octopus bimaculoides is a marine animal of phylum Mollusca that exhibits great regenerative abilities. In this research study, piRNA expression was examined in the regenerated tentacles of O. bimaculoides, with its somatic tissue used as the control. RepeatMasker analysis showed that piRNAs were targeting repeated elements, most notably DNA transposons, long terminal repeats (LTRs), and long interspersed nuclear elements (LINEs). Gene ontology analysis showed that piRNAs were targeting genes implicated in the regulation of transcription, cell communication, signal transduction, and intrinsic and integral components of the cell membrane

    Quality of Computationally Inferred Gene Ontology Annotations

    Gene Ontology (GO) has established itself as the undisputed standard for protein function annotation. Most annotations are inferred electronically, i.e. without individual curator supervision, but they are widely considered unreliable. At the same time, we crucially depend on those automated annotations, as most newly sequenced genomes are non-model organisms. Here, we introduce a methodology to systematically and quantitatively evaluate electronic annotations. By exploiting changes in successive releases of the UniProt Gene Ontology Annotation database, we assessed the quality of electronic annotations in terms of specificity, reliability, and coverage. Overall, we not only found that electronic annotations have significantly improved in recent years, but also that their reliability now rivals that of annotations inferred by curators when they use evidence other than experiments from primary literature. This work provides the means to identify the subset of electronic annotations that can be relied upon—an important outcome given that >98% of all annotations are inferred without direct curation
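    The idea of scoring electronic annotations against a later database release can be illustrated with a toy calculation. This is a minimal sketch under stated assumptions, not the authors' exact methodology: an old electronic annotation counts as "confirmed" if the later release supports it experimentally, as "rejected" if it has disappeared from the later release, and is otherwise left unjudged. The GO hierarchy is ignored, and the protein IDs and the `reliability` function are hypothetical.

```python
from typing import Optional, Set, Tuple

Annotation = Tuple[str, str]  # (protein ID, GO term ID)


def reliability(
    old_electronic: Set[Annotation],
    new_experimental: Set[Annotation],
    new_electronic: Set[Annotation],
) -> Optional[float]:
    """Share of judged old electronic annotations that were later confirmed."""
    # Confirmed: the later release backs the annotation with an experiment.
    confirmed = old_electronic & new_experimental
    # Rejected: the annotation is absent from the later release altogether.
    rejected = old_electronic - new_experimental - new_electronic
    judged = len(confirmed) + len(rejected)
    return len(confirmed) / judged if judged else None


# Toy data with made-up protein IDs P1..P4
old_electronic = {
    ("P1", "GO:0003677"),
    ("P2", "GO:0005515"),
    ("P3", "GO:0016020"),
    ("P4", "GO:0006355"),
}
new_experimental = {("P1", "GO:0003677"), ("P4", "GO:0006355")}
new_electronic = {("P3", "GO:0016020")}  # still present, still unverified

score = reliability(old_electronic, new_experimental, new_electronic)
print(score)
```

    In this toy data two annotations are confirmed, one is rejected, and one remains unjudged, so the score is two thirds.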

    Phylogenetic-based propagation of functional annotations within the Gene Ontology consortium

    The goal of the Gene Ontology (GO) project is to provide a uniform way to describe the functions of gene products from organisms across all kingdoms of life and thereby enable analysis of genomic data. Protein annotations are either based on experiments or predicted from protein sequences. Since most sequences have not been experimentally characterized, most available annotations need to be based on predictions. To make inferences as accurate as possible, the GO Consortium's Reference Genome Project is using an explicit evolutionary framework to infer annotations of proteins from a broad set of genomes from experimental annotations in a semi-automated manner. Most components in the pipeline, such as selection of sequences, building multiple sequence alignments and phylogenetic trees, retrieving experimental annotations and depositing inferred annotations, are fully automated. However, the most crucial step in our pipeline relies on software-assisted curation by an expert biologist. This curation tool, Phylogenetic Annotation and INference Tool (PAINT), helps curators to infer annotations among members of a protein family. PAINT allows curators to make precise assertions as to when functions were gained and lost during evolution and record the evidence (e.g. experimentally supported GO annotations and phylogenetic information including orthology) for those assertions. In this article, we describe how we use PAINT to infer protein function in a phylogenetic context with emphasis on its strengths, limitations and guidelines. We also discuss specific examples showing how PAINT annotations compare with those generated by other widely used homology-based methods
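    The notion of asserting where a function was gained or lost on a tree, and letting descendants inherit the result, can be sketched in a few lines. This is an illustration of the general idea only, not PAINT's implementation: the tree encoding, the node names, and the `propagate` function are all hypothetical.

```python
from typing import Dict, FrozenSet, List, Set


def propagate(
    tree: Dict[str, List[str]],
    node: str,
    gains: Dict[str, Set[str]],
    losses: Dict[str, Set[str]],
    inherited: FrozenSet[str] = frozenset(),
) -> Dict[str, Set[str]]:
    """Return the GO terms inferred for every leaf below `node`."""
    # Apply curator assertions at this node: gains first, then losses.
    current = (set(inherited) | gains.get(node, set())) - losses.get(node, set())
    children = tree.get(node, [])
    if not children:  # a leaf, i.e. an extant protein
        return {node: current}
    result: Dict[str, Set[str]] = {}
    for child in children:
        result.update(propagate(tree, child, gains, losses, frozenset(current)))
    return result


# Hypothetical family: the function arises at the root and is lost in a2.
tree = {"root": ["A", "B"], "A": ["a1", "a2"], "B": ["b1"]}
gains = {"root": {"GO:0003677"}}
losses = {"a2": {"GO:0003677"}}

inferred = propagate(tree, "root", gains, losses)
print(inferred)
```

    Here a1 and b1 inherit the term asserted at the root, while a2 does not because of the loss recorded on its branch.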