103 research outputs found
Analysis of Cancer Omics Data In A Semantic Web Framework
Our work concerns the elucidation of the cancer (epi)genome, transcriptome and proteome to better understand the complex interplay between a cancer cell's molecular state and its response to anti-cancer therapy. To study the problem, we have previously focused on data warehousing technologies and statistical data integration. In this paper, we present recent work on extending our analytical capabilities using Semantic Web technology. A key new component presented here is a SPARQL endpoint to our existing data warehouse. This endpoint allows the merging of observed quantitative data with existing data from semantic knowledge sources such as Gene Ontology (GO). We show how such variegated quantitative and functional data can be integrated and accessed in a universal manner using Semantic Web tools. We also demonstrate how Description Logic (DL) reasoning can be used to infer previously unstated conclusions from existing knowledge bases. As proof of concept, we illustrate the ability of our setup to answer complex queries on resistance of cancer cells to Decitabine, a demethylating agent.
A semantic web framework to integrate cancer omics data with biological knowledge
BACKGROUND: The RDF triple provides a simple linguistic means of describing limitless types of information. Triples can be flexibly combined into a unified data source we call a semantic model. Semantic models open new possibilities for the integration of variegated biological data. We use Semantic Web technology to explicate high throughput clinical data in the context of fundamental biological knowledge. We have extended Corvus, a data warehouse which provides a uniform interface to various forms of Omics data, by providing a SPARQL endpoint. With the querying and reasoning tools made possible by the Semantic Web, we were able to explore quantitative semantic models retrieved from Corvus in the light of systematic biological knowledge. RESULTS: For this paper, we merged semantic models containing genomic, transcriptomic and epigenomic data from melanoma samples with two semantic models of functional data - one containing Gene Ontology (GO) data, the other, regulatory networks constructed from transcription factor binding information. These two semantic models were created in an ad hoc manner but support a common interface for integration with the quantitative semantic models. Such combined semantic models allow us to pose significant translational medicine questions. Here, we study the interplay between a cell's molecular state and its response to anti-cancer therapy by exploring the resistance of cancer cells to Decitabine, a demethylating agent. CONCLUSIONS: We were able to generate a testable hypothesis to explain how Decitabine fights cancer - namely, that it targets apoptosis-related gene promoters predominantly in Decitabine-sensitive cell lines, thus conveying its cytotoxic effect by activating the apoptosis pathway. Our research provides a framework whereby similar hypotheses can be developed easily
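The idea that triples from separate sources combine into one semantic model can be sketched in a few lines of Python. This is only an illustration under stated assumptions: triples are plain tuples rather than RDF, and every identifier (the `gene:` and `go:` names, the `ex:` predicates, the methylation values) is hypothetical and not taken from the paper, which uses an RDF store queried through SPARQL.

```python
# Triples as (subject, predicate, object) tuples.
# All identifiers and values below are hypothetical placeholders.
quantitative = {
    ("gene:G1", "ex:promoterMethylation", 0.82),
    ("gene:G2", "ex:promoterMethylation", 0.10),
}
functional = {
    ("gene:G1", "ex:annotatedWith", "go:apoptosis"),
    ("gene:G2", "ex:annotatedWith", "go:cell_cycle"),
}


def merge(*models):
    """Combine semantic models by taking the set union of their triples."""
    merged = set()
    for model in models:
        merged |= model
    return merged


def methylation_for_term(model, term):
    """Join quantitative and functional triples on their shared subject."""
    genes = {s for (s, p, o) in model if p == "ex:annotatedWith" and o == term}
    return {
        s: o
        for (s, p, o) in model
        if p == "ex:promoterMethylation" and s in genes
    }


model = merge(quantitative, functional)
result = methylation_for_term(model, "go:apoptosis")
```

Because both sources use the same subject identifiers, the merged set can be queried across data types without any schema mapping, which is the property the abstract attributes to semantic models.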
Complex Genetic Interactions in a Quantitative Trait Locus
Whether in natural populations or between two unrelated members of a species, most phenotypic variation is quantitative. To analyze such quantitative traits, one must first map the underlying quantitative trait loci. Next, and far more difficult, one must identify the quantitative trait genes (QTGs), characterize QTG interactions, and identify the phenotypically relevant polymorphisms to determine how QTGs contribute to phenotype. In this work, we analyzed three Saccharomyces cerevisiae high-temperature growth (Htg) QTGs (MKT1, END3, and RHO2). We observed a high level of genetic interactions among QTGs and strain background. Interestingly, while the MKT1 and END3 coding polymorphisms contribute to phenotype, it is the RHO2 3′UTR polymorphisms that are phenotypically relevant. Reciprocal hemizygosity analysis of the Htg QTGs in hybrids between S288c and ten unrelated S. cerevisiae strains reveals that the contributions of the Htg QTGs are not conserved in nine other hybrids, which has implications for QTG identification by marker-trait association. Our findings demonstrate the variety and complexity of QTG contributions to phenotype, the impact of genetic background, and the value of quantitative genetic studies in S. cerevisiae
A linked data representation for summary statistics and grouping criteria
Summary statistics are fundamental to data science, and are the building blocks of statistical reasoning. Most of the data and statistics made available on government web sites are aggregate; however, until now, we have not had a suitable linked data representation available. We propose a way to express summary statistics across aggregate groups as linked data using Web Ontology Language (OWL) class-based sets, where members of the set contribute to the overall aggregate value. Additionally, many clinical studies in the biomedical field rely on demographic summaries of their study cohorts and the patients assigned to each arm. While most data query languages, including SPARQL, allow for computation of summary statistics, they do not provide a way to integrate those values back into the RDF graphs they were computed from. We represent this knowledge, which would otherwise be lost, through the use of OWL 2 punning semantics, the expression of aggregate grouping criteria as OWL classes with variables, and constructs from the Semanticscience Integrated Ontology (SIO) and the World Wide Web Consortium’s provenance ontology, PROV-O, providing interoperable representations that are well supported across the web of Linked Data. We evaluate these semantics using a Resource Description Framework (RDF) representation of patient case information from the Genomic Data Commons, a data portal from the National Cancer Institute.
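The gap described above, where a computed aggregate is not written back into the graph it came from, can be illustrated with a small Python sketch. It is a simplification under stated assumptions: triples are plain tuples, the `ex:` predicates and patient data are hypothetical, and it does not reproduce the paper's actual representation based on OWL 2 punning, SIO and PROV-O.

```python
from statistics import mean

# Hypothetical patient triples: study arm and age per patient.
graph = {
    ("patient:1", "ex:arm", "arm:A"), ("patient:1", "ex:age", 60),
    ("patient:2", "ex:arm", "arm:A"), ("patient:2", "ex:age", 70),
    ("patient:3", "ex:arm", "arm:B"), ("patient:3", "ex:age", 50),
}


def add_group_means(graph, group_pred, value_pred, mean_pred):
    """Return a copy of the graph with per-group means and memberships added."""
    values = {s: o for (s, p, o) in graph if p == value_pred}
    groups = {}
    for (s, p, o) in graph:
        if p == group_pred and s in values:
            groups.setdefault(o, []).append(s)

    enriched = set(graph)
    for group, members in groups.items():
        # The aggregate is attached to the group itself ...
        enriched.add((group, mean_pred, mean(values[m] for m in members)))
        # ... and the contributing members stay linked to it.
        for member in members:
            enriched.add((group, "ex:hasMember", member))
    return enriched


enriched = add_group_means(graph, "ex:arm", "ex:age", "ex:meanAge")
```

Keeping the `ex:hasMember` links alongside the mean is the point of the sketch: the aggregate value stays traceable to the individuals that contributed to it.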
Semantic web data warehousing for caGrid
The National Cancer Institute (NCI) is developing caGrid as a means for sharing cancer-related data and services. As more data sets become available on caGrid, we need effective ways of accessing and integrating this information. Although the data models exposed on caGrid are semantically well annotated, it is currently up to the caGrid client to infer relationships between the different models and their classes. In this paper, we present a Semantic Web-based data warehouse (Corvus) for creating relationships among caGrid models. This is accomplished through the transformation of semantically-annotated caBIG® Unified Modeling Language (UML) information models into Web Ontology Language (OWL) ontologies that preserve those semantics. We demonstrate the validity of the approach by Semantic Extraction, Transformation and Loading (SETL) of data from two caGrid data sources, caTissue and caArray, as well as alignment and query of those sources in Corvus. We argue that semantic integration is necessary for integration of data from distributed web services and that Corvus is a useful way of accomplishing this. Our approach is generalizable and of broad utility to researchers facing similar integration challenges
Making Study Populations Visible through Knowledge Graphs
Treatment recommendations within Clinical Practice Guidelines (CPGs) are
largely based on findings from clinical trials and case studies, referred to
here as research studies, that are often based on highly selective clinical
populations, referred to here as study cohorts. When medical practitioners
apply CPG recommendations, they need to understand how well their patient
population matches the characteristics of those in the study cohort, and thus
are confronted with the challenges of locating the study cohort information and
making an analytic comparison. To address these challenges, we develop an
ontology-enabled prototype system, which exposes the population descriptions in
research studies in a declarative manner, with the ultimate goal of allowing
medical practitioners to better understand the applicability and
generalizability of treatment recommendations. We build a Study Cohort Ontology
(SCO) to encode the vocabulary of study population descriptions, which are often
reported in the first table of the published work and are thus commonly referred
to as Table 1. We leverage the well-used Semanticscience Integrated Ontology
(SIO) for defining property associations between classes. Further, we model the
key components of Table 1s, i.e., collections of study subjects, subject
characteristics, and statistical measures in RDF knowledge graphs. We design
scenarios for medical practitioners to perform population analysis, and
generate cohort similarity visualizations to determine the applicability of a
study population to the clinical population of interest. Our semantic approach
to make study populations visible, by standardized representations of Table 1s,
allows users to quickly derive clinically relevant inferences about study
populations.
Comment: 16 pages, 4 figures, 1 table, accepted to the ISWC 2019 Resources
Track (https://iswc2019.semanticweb.org/call-for-resources-track-papers/)
PLX4032, a selective BRAFV600E kinase inhibitor, activates the ERK pathway and enhances cell migration and proliferation of BRAFWT melanoma cells
BRAFV600E/K is a frequent mutationally active tumor-specific kinase in melanomas that is currently targeted for therapy by the specific inhibitor PLX4032. Our studies with melanoma tumor cells that are BRAFV600E/K and BRAFWT showed that, paradoxically, while PLX4032 inhibited ERK1/2 in the highly sensitive BRAFV600E/K, it activated the pathway in the resistant BRAFWT cells, via RAF1 activation, regardless of the status of mutations in NRAS or PTEN. The persistently active ERK1/2 triggered downstream effectors in BRAFWT melanoma cells and induced changes in the expression of a wide-spectrum of genes associated with cell cycle control. Furthermore, PLX4032 increased the rate of proliferation of growth factor-dependent NRAS Q61L mutant primary melanoma cells, reduced cell adherence and increased mobility of cells from advanced lesions. The results suggest that the drug can confer an advantage to BRAFWT primary and metastatic tumor cells in vivo and provide markers for monitoring clinical responses
The Translational Medicine Ontology and Knowledge Base: driving personalized medicine by bridging the gap between bench and bedside
Background: Translational medicine requires the integration of knowledge using heterogeneous data from health care to the life sciences. Here, we describe a collaborative effort to produce a prototype Translational Medicine Knowledge Base (TMKB) capable of answering questions relating to clinical practice and pharmaceutical drug discovery. Results: We developed the Translational Medicine Ontology (TMO) as a unifying ontology to integrate chemical, genomic and proteomic data with disease, treatment, and electronic health records. We demonstrate the use of Semantic Web technologies in the integration of patient and biomedical data, and reveal how such a knowledge base can aid physicians in providing tailored patient care and facilitate the recruitment of patients into active clinical trials. Thus, patients, physicians and researchers may explore the knowledge base to better understand therapeutic options, efficacy, and mechanisms of action. Conclusions: This work takes an important step in using Semantic Web technologies to facilitate integration of relevant, distributed, external sources and progress towards a computational platform to support personalized medicine. Availability: TMO can be downloaded from http://code.google.com/p/translationalmedicineontology and TMKB can be accessed at http://tm.semanticscience.org/sparql
- …