
    Special issue on bio-ontologies and phenotypes

    The bio-ontologies and phenotypes special issue includes eight papers selected from the 11 papers presented at the Bio-Ontologies SIG (Special Interest Group) and the Phenotype Day at ISMB (Intelligent Systems for Molecular Biology) conference in Boston in 2014. The selected papers span a wide range of topics including the automated re-use and update of ontologies, quality assessment of ontological resources, and the systematic description of phenotype variation, driven by manual, semi- and fully automatic means

    Pathway level subtyping identifies a slow-cycling biological phenotype associated with poor clinical outcomes in colorectal cancer

    Molecular stratification using gene-level transcriptional data has identified subtypes with distinctive genotypic and phenotypic traits, as exemplified by the consensus molecular subtypes (CMS) in colorectal cancer (CRC). Here, rather than gene-level data, we make use of gene ontology and biological activation state information for initial molecular class discovery. In doing so, we defined three pathway-derived subtypes (PDS) in CRC: PDS1 tumors, which are canonical/LGR5+ stem-rich, highly proliferative and display good prognosis; PDS2 tumors, which are regenerative/ANXA1+ stem-rich, with elevated stromal and immune tumor microenvironmental lineages; and PDS3 tumors, which represent a previously overlooked slow-cycling subset of tumors within CMS2 with reduced stem populations and increased differentiated lineages, particularly enterocytes and enteroendocrine cells, yet display the worst prognosis in locally advanced disease. These PDS3 phenotypic traits are evident across numerous bulk and single-cell datasets, and demark a series of subtle biological states that are currently under-represented in pre-clinical models and are not identified using existing subtyping classifiers

    Genome-wide transcriptome induced in osteoclast-like cells differentiated on three different hard tissues

    Hard tissue resorption is a multistep process, which requires a complex interaction between multinucleated clast cells and the hard tissues to be resorbed. This process is regulated by genetic factors and inflammatory cytokines. This study was directed toward the clast cells to clarify the ability of these cells to distinguish between the resorbed substrates and the possible influence of the resorbed tissues on the intracellular mechanism behind the resorption process. To study the influence of the mineralized tissues on the differentiation and activity of the clast cells, macrophage cells were cultured on powder substrates made from bone, dentine and cementum tissues. After cell differentiation, ribonucleic acid was isolated and gene expression induced by the three hard tissues was analyzed based on RNA-Sequencing. The results showed many differentially expressed genes between the samples of the study. A total of 1930 genes were significantly differentially regulated on cementum compared with the cells stimulated on glass, 446 between bone tissue and the positive control, and 87 between dentine and control samples. Moreover, the comparison between the stimulated cells on the hard tissues showed 314 differentially regulated genes between cementum and dentine, 252 between bone and cementum and just one significantly differentially regulated gene between dentine and bone. These results reflect the influence of the hard substrates on the differentiation and activity of osteoclast-like cells. For example, the cathepsin K gene was downregulated 3.2-fold in the bone samples and 4.8-fold in the cementum samples compared to the cells differentiated on glass, but in the dentine groups there wasn't any significant difference in gene expression. Comparing bone and dentine samples, it was upregulated 3.8-fold in the dentine samples, and it was downregulated 1.5-fold in the cementum samples when compared with bone samples. 
In the comparison between cementum and dentine samples, cathepsin K was upregulated 11.3-fold in dentine samples. Some possible candidates were identified, such as the inhibitor of differentiation 1 gene, which was upregulated only in cementum samples in comparison with cells differentiated on glass and upregulated 3.5-fold in cementum samples compared with bone samples. In comparison with dentine, this gene was downregulated 2.7-fold in dentine samples and didn't show any significant change between bone and dentine samples. The present study expands our understanding of the interaction between clast cells and mineralized tissues and indicates new possible target genes of relevance for diagnostic and therapeutic strategies

    COMPUTATIONAL TOOLS FOR THE DYNAMIC CATEGORIZATION AND AUGMENTED UTILIZATION OF THE GENE ONTOLOGY

    Ontologies provide an organization of language, in the form of a network or graph, which is amenable to computational analysis while remaining human-readable. Although they are used in a variety of disciplines, ontologies in the biomedical field, such as the Gene Ontology, are of interest for their role in organizing terminology used to describe, among other concepts, the functions, locations, and processes of genes and gene products. Due to the consistency and level of automation that ontologies provide for such annotations, methods for finding enriched biological terminology from a set of differentially identified genes in a tissue or cell sample have been developed to aid in the elucidation of disease pathology and unknown biochemical pathways. However, despite their immense utility, biomedical ontologies have significant limitations and caveats. One major issue is that gene annotation enrichment analyses often result in many redundant, individually enriched ontological terms that are highly specific and weakly justified by statistical significance. These large sets of weakly enriched terms are difficult to interpret without manually sorting into appropriate functional or descriptive categories. Also, relationships that organize the terminology within these ontologies do not contain descriptions of semantic scoping or scaling among terms. Therefore, there exists some ambiguity, which complicates the automation of categorizing terms to improve interpretability. We emphasize that existing methods risk producing incorrect mappings to categories as a result of these ambiguities, unless simplified and incomplete versions of these ontologies are used which omit problematic relations. 
Such ambiguities could have a significant impact on term categorization, as we have calculated upper boundary estimates of potential false categorizations as high as 121,579 for the misinterpretation of a single scoping relation, has_part, which accounts for approximately 18% of the total possible mappings between terms in the Gene Ontology. However, the omission of problematic relationships results in a significant loss of retrievable information. In the Gene Ontology, this accounts for a 6% reduction for the omission of a single relation. However, this percentage should increase drastically when considering all relations in an ontology. To address these issues, we have developed methods which categorize individual ontology terms into broad, biologically-related concepts to improve the interpretability and statistical significance of gene-annotation enrichment studies, meanwhile addressing the lack of semantic scoping and scaling descriptions among ontological relationships so that annotation enrichment analyses can be performed across a more complete representation of the ontological graph. We show that, when compared to similar term categorization methods, our method produces categorizations that match hand-curated ones with similar or better accuracy, while not requiring the user to compile lists of individual ontology term IDs. Furthermore, our handling of problematic relations produces a more complete representation of ontological information from a scoping perspective, and we demonstrate instances where medically-relevant terms--and by extension putative gene targets--are identified in our annotation enrichment results that would be otherwise missed when using traditional methods. Additionally, we observed a marginal, yet consistent improvement of statistical power in enrichment results when our methods were used, compared to traditional enrichment analyses that utilize ontological ancestors. 
Finally, using scalable and reproducible data workflow pipelines, we have applied our methods to several genomic, transcriptomic, and proteomic collaborative projects

    Psoriasis : from transcriptome to miRNA function and biomarkers

    Psoriasis is a chronic inflammatory, immune-mediated skin condition that affects on average 2 to 3% of the world population, phenotypically characterized by red and scaly plaques on the skin of affected patients. It is a multifactorial disorder, in which both genetic predisposition and environment play key roles. Psoriasis lesional skin is characterized by abnormal keratinocyte differentiation and proliferation, as well as dermal immune cell infiltration. Psoriasis is associated with several comorbidities, e.g. arthritis; however, currently no biomarkers exist that could be used to predict or identify these at an early stage. Many studies have aimed to characterize the psoriasis transcriptome, but few have focused on elucidating the gene alterations in keratinocytes in this disease. In this thesis, we explored the transcriptomic landscape of epidermal cells from lesional and non-lesional skin of patients with psoriasis, as well as from healthy volunteers' skin, and investigated the biomarker potential of circulating microRNAs. In our first study, we investigated the alterations of the protein-coding transcriptome in the psoriasis epidermal compartment. The separation of the epidermis from the dermis and sorting for CD45-neg cells allowed us to exclude dermal signatures including those from fibroblasts, endothelial cells, dendritic cells and T cells, but also from immune cells infiltrating the epidermis, known to populate psoriasis lesional skin at an increased ratio. We have identified biological pathways related to immune responses, cell cycle and keratinization involved in the epidermal alterations, as well as the enrichment and dominance of psoriasis-associated cytokine signatures. Moreover, we established that genetic variations associated with psoriasis may contribute to the keratinocyte transcriptomic changes in the disease. 
In our second study, we investigated the alterations of the non-protein-coding transcriptome in psoriasis and identified a set of long non-coding RNAs differentially expressed in psoriasis epidermal cells. Several had genomic localization overlapping psoriasis-associated SNPs, suggesting their potential implication in the genetic susceptibility to psoriasis. We validated the over-expression of the lncRNA LINC00958 in CD45-neg cells from psoriasis lesions compared to non-lesional and healthy skin and determined its expression in different skin cell types and its subcellular localization. In our third study, we focused on psoriatic arthritis, the major psoriasis comorbidity, affecting about 1/3 of the patients with cutaneous psoriasis. In particular, we investigated the potential of circulating microRNAs as biomarkers for early diagnosis of psoriatic arthritis symptoms in patients with cutaneous psoriasis. We have identified two circulating microRNAs, let-7b-5p and miR-30e-5p, with significantly reduced levels in plasma-derived extracellular vesicles of patients with confirmed psoriatic arthritis, compared to cutaneous-only psoriasis patients. Finally, in our fourth study, we investigated the role and functions of miR-378a, previously found overexpressed in psoriasis lesional keratinocytes compared to non-lesional and healthy skin. In vivo, in a mouse model of psoriasis-like skin inflammation, the injection of miR-378a resulted in increased clinical signs of inflammation, increased skin thickness and an increased number of proliferating cells in the epidermis. In vitro, in cultured primary human keratinocytes, miR-378a overexpression enhanced the expression of the pro-inflammatory chemokines CXCL8/IL8 and CCL20 and reduced NFKBIA protein levels

    INTEROPERABILITY IN TOXICOLOGY: CONNECTING CHEMICAL, BIOLOGICAL, AND COMPLEX DISEASE DATA

    The current regulatory framework in toxicology is expanding beyond traditional animal toxicity testing to include new approach methodologies (NAMs) like computational models built using rapidly generated dose-response information like US Environmental Protection Agency's Toxicity Forecaster (ToxCast) and the interagency collaborative Tox21 initiative. These programs have provided new opportunities for research but also introduced challenges in application of this information to current regulatory needs. One such challenge is linking in vitro chemical bioactivity to adverse outcomes like cancer or other complex diseases. To utilize NAMs in prediction of complex disease, information from traditional and new sources must be interoperable for easy integration. The work presented here describes the development of a bioinformatic tool, a database of traditional toxicity information with improved interoperability, and efforts to use these new tools together to inform prediction of cancer and complex disease. First, a bioinformatic tool was developed to provide a ranked list of Medical Subject Heading (MeSH) to gene associations based on literature support, enabling connection of complex diseases to genes potentially involved. Second, a seminal resource of traditional toxicity information, Toxicity Reference Database (ToxRefDB), was redeveloped, including a controlled vocabulary for adverse events used to map identifiers in the Unified Medical Language System (UMLS), thus enabling a connection to MeSH terms. Finally, gene to MeSH associations were used to evaluate the biological coverage of ToxCast for cancer to understand the capacity to use ToxCast to identify chemical hazard potential. ToxCast covers many gene targets putatively linked to cancer; however, more information on pathways in cancer progression is needed to identify robust associations between chemical exposure and risk of complex disease. 
The findings herein demonstrate that increased interoperability between data resources is necessary to leverage the large amount of data currently available to understand the role environmental exposures play in etiologies of complex diseases. Doctor of Philosoph

    Linking gene expression to phenotypes via pathway information

    Establishing robust links among gene expression, pathways and phenotypes is critical for understanding diseases and developing treatments. In recent years there have been many efforts to develop the computational means to traverse from genes to gene expression, model pathways and classify phenotypes. Numerous ontologies and other controlled vocabularies have been developed, as well as computational methods to combine and mine these data sets and establish connections. Here we discuss these efforts and identify areas of future work that could lead to a better integration of genes, pathways and phenotypes to provide insights into the mechanisms by which gene mutations affect expression and pathways and how these effects are manifested in the phenotype. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s13326-015-0013-5) contains supplementary material, which is available to authorized users