    A case study in pathway knowledgebase verification

    BACKGROUND: Biological databases and pathway knowledgebases are proliferating rapidly. We are developing software tools for computer-aided hypothesis design and evaluation, and we would like our tools to take advantage of the information stored in these repositories. But before we can reliably use a pathway knowledgebase as a data source, we need to proofread it to ensure that it can fully support computer-aided information integration and inference. RESULTS: We design a series of logical tests to detect potential problems we might encounter using a particular knowledgebase, the Reactome database, with a particular computer-aided hypothesis evaluation tool, HyBrow. We develop an explicit formal language from the language implicit in the Reactome data format and specify a logic to evaluate models expressed in this language, using the formalism of finite model theory. We then use this logic to formulate tests for desirable properties (such as completeness, consistency, and well-formedness) of pathways stored in Reactome. We apply these tests to the publicly available Reactome releases 10 through 14 and compare the results, which show a steady decrease in inconsistencies from release to release. We also investigate and discuss Reactome's potential for supporting computer-aided inference tools. CONCLUSION: The case study described in this work demonstrates that our model-theory-based approach can identify problems one might encounter when using a knowledgebase to support hypothesis evaluation tools. The methodology is general and is in no way restricted to the specific knowledgebase employed in this case study. Future applications of this methodology will enable us to compare pathway resources with respect to the generic properties such resources will need to possess if they are to support automated reasoning.
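    To make the flavor of such logical tests concrete, here is a minimal sketch of well-formedness and completeness checks over a toy pathway representation. The record layout (a hypothetical Reaction with input and output sets) is invented for illustration and is not Reactome's actual schema or the paper's formal language.

```python
# Minimal sketch of knowledgebase well-formedness / completeness tests.
# The record layout here is hypothetical, not Reactome's actual schema.
from dataclasses import dataclass, field

@dataclass
class Reaction:
    name: str
    inputs: set = field(default_factory=set)
    outputs: set = field(default_factory=set)

def check_pathway(entities: set, reactions: list) -> list:
    """Return human-readable violations of two simple properties."""
    problems = []
    for r in reactions:
        # Well-formedness: a reaction must consume and produce something.
        if not r.inputs or not r.outputs:
            problems.append(f"{r.name}: missing inputs or outputs")
        # Completeness: every participant must be a defined entity.
        for e in (r.inputs | r.outputs) - entities:
            problems.append(f"{r.name}: undefined participant '{e}'")
    return problems

entities = {"glucose", "G6P", "ATP", "ADP"}
reactions = [Reaction("hexokinase step", {"glucose", "ATP"}, {"G6P", "ADP"}),
             Reaction("orphan step", {"fructose"}, set())]
print(check_pathway(entities, reactions))
```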

    Developing Network-Based Systems Toxicology by Combining Transcriptomics Data with Literature Mining and Multiscale Quantitative Modeling

    We describe how genome-wide transcriptional profiling can be used in network-based systems toxicology, an approach that leverages biological networks to assess the health risks of exposure to chemical compounds. Driven by technological advances that are changing the ways in which data are generated, systems toxicology has allowed traditional toxicity endpoints to be enhanced with far deeper levels of analysis. In combination, new experimental and computational methods offer the potential for more effective, efficient, and reliable toxicological testing strategies. We illustrate these advances with the “network perturbation amplitude” methodology, which quantifies the effects of exposure treatments on biological mechanisms represented by causal networks. We also describe recent developments in the assembly of high-quality causal biological networks using crowdsourcing and text-mining approaches. We further show how network-based approaches can be integrated into the multiscale modeling framework of response to toxicological exposure. Finally, we combine biological knowledge assembly and multiscale modeling to report on the promising developments of the “quantitative adverse outcome pathway” concept, which spans multiple levels of biological organization, from molecules to population, and has direct relevance in the context of the “Toxicity Testing in the 21st century” vision of the US National Research Council.
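    As a rough illustration of scoring perturbations on a causal network, the toy function below aggregates differential-expression values over signed edges. This is a didactic simplification with invented data, not the published network perturbation amplitude algorithm.

```python
# Toy illustration of a network-perturbation-style score: aggregate
# differential-expression values over a causal network whose edges carry
# signs (+1 activation, -1 inhibition). A didactic simplification, not
# the published NPA algorithm.
def perturbation_amplitude(edges, log2fc):
    """edges: list of (source, sign, target); log2fc: gene -> log2 fold change."""
    total, n = 0.0, 0
    for src, sign, tgt in edges:
        if src in log2fc and tgt in log2fc:
            # An edge contributes positively when the observed changes
            # agree with the causal sign of the relationship.
            total += sign * log2fc[src] * log2fc[tgt]
            n += 1
    return total / n if n else 0.0

edges = [("TNF", +1, "NFKB1"), ("NFKB1", +1, "IL6"), ("NFKB1", -1, "BCL2")]
log2fc = {"TNF": 1.8, "NFKB1": 1.2, "IL6": 2.0, "BCL2": -0.9}
print(perturbation_amplitude(edges, log2fc))
```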

    Monitoring and analysis of data from complex systems

    Some of the methods, systems, and prototypes that have been tested for monitoring and analyzing the data from several spacecraft and vehicles at the Marshall Space Flight Center are introduced. For the Huntsville Operations Support Center (HOSC) infrastructure, the Marshall Integrated Support System (MISS) provides a migration path to the state-of-the-art workstation environment. Its modular design makes it possible to implement the system in stages on multiple platforms without the need for all components to be in place at once. The MISS provides a flexible, user-friendly environment for monitoring and controlling orbital payloads. In addition, new capabilities and technology may be incorporated into MISS with greater ease. Information systems technology is used in advanced prototype phases, as an adjunct to mainline activities, to evaluate new computational techniques for monitoring and analysis of complex systems. Much of the software described (specifically, HSTORESIS (Hubble Space Telescope Operational Readiness Expert Safemode Investigation System), DRS (Device Reasoning Shell), DART (Design Alternatives Rational Tool), elements of the DRA (Document Retrieval Assistant), and software for the PPS (Peripheral Processing System) and the HSPP (High-Speed Peripheral Processor)) is available with supporting documentation, and may be applicable to other system monitoring and analysis applications.

    Ontology-based instance data validation for high-quality curated biological pathways

    BACKGROUND: Modeling in systems biology is vital for understanding the complexity of biological systems across scales and for predicting system-level behaviors. To obtain high-quality pathway databases, it is essential to improve the efficiency of model validation and of model updates based on appropriate feedback. RESULTS: We have developed a new method to guide the creation of novel high-quality biological pathways, using rule-based validation. Rules are defined to check models against biological semantics and to improve models for dynamic simulation. In this work, we defined 40 rules that constrain event-specific participants and their related features, and that add missing processes based on biological events. This approach is applied to data in the Cell System Ontology, a comprehensive ontology that represents complex biological pathways with dynamics and visualization. The experimental results show that relatively simple rules can efficiently detect errors made during curation, such as misassignment and misuse of ontology concepts and terms in curated models. CONCLUSIONS: A new rule-based approach has been developed to facilitate model validation and model complementation. Our rule-based validation, which embeds biological semantics, enables us to provide high-quality curated biological pathways. This approach can serve as a preprocessing step for model integration, data exchange and extraction, and simulation.
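    The sketch below illustrates the general shape of such rule-based validation: each event type constrains the participant roles it must carry. The rule contents and role names are invented for illustration; the paper's 40 rules target Cell System Ontology concepts.

```python
# Sketch of rule-based instance validation: each event type constrains
# the participant roles it must carry. Rule contents are invented for
# illustration, not taken from the paper's actual rule set.
REQUIRED_ROLES = {
    "phosphorylation": {"kinase", "substrate"},
    "transport": {"cargo", "source_compartment", "target_compartment"},
}

def validate_event(event_type: str, participants: dict) -> list:
    """participants maps role name -> entity; returns missing-role errors."""
    required = REQUIRED_ROLES.get(event_type, set())
    return [f"{event_type}: missing role '{role}'"
            for role in required - participants.keys()]

print(validate_event("phosphorylation", {"kinase": "AKT1"}))
# -> ["phosphorylation: missing role 'substrate'"]
```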

    The Gene Ontology Resource: 20 years and still GOing strong

    The Gene Ontology resource (GO; http://geneontology.org) provides structured, computable knowledge regarding the functions of genes and gene products. Founded in 1998, GO has become widely adopted in the life sciences, and its contents are under continual improvement, both in quantity and in quality. Here, we report the major developments of the GO resource during the past two years. Each monthly release of the GO resource is now packaged and given a unique identifier (DOI), enabling GO-based analyses on a specific release to be reproduced in the future. The molecular function ontology has been refactored to better represent the overall activities of gene products, with a focus on transcription regulator activities. Quality assurance efforts have been ramped up to address potentially out-of-date or inaccurate annotations. New evidence codes for high-throughput experiments now enable users to filter out annotations obtained from these sources. GO-CAM, a new framework for representing gene function that is more expressive than standard GO annotations, has been released, and users can now explore the growing repository of these models. We also provide the ‘GO ribbon’ widget for visualizing GO annotations to a gene; the widget can be easily embedded in any web page.
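    As an example of the filtering the new evidence codes enable, the sketch below drops high-throughput annotations while reading a GAF 2.x file. Column positions follow the GAF specification; the file name is a placeholder.

```python
# Sketch: drop GO annotations derived from high-throughput experiments
# when reading a GAF 2.x file. The evidence code is the 7th tab-separated
# field per the GAF spec; the file path below is a placeholder.
HIGH_THROUGHPUT = {"HTP", "HDA", "HMP", "HGI", "HEP"}

def filtered_annotations(gaf_path):
    with open(gaf_path) as fh:
        for line in fh:
            if line.startswith("!"):        # GAF header/comment lines
                continue
            fields = line.rstrip("\n").split("\t")
            if fields[6] not in HIGH_THROUGHPUT:
                yield fields[2], fields[4]  # gene symbol, GO term ID

for symbol, go_id in filtered_annotations("goa_human.gaf"):
    print(symbol, go_id)
```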

    Enhancement of COPD Biological Networks Using a Web-Based Collaboration Interface

    The construction and application of biological network models is an approach that offers a holistic way to understand biological processes involved in disease. Chronic obstructive pulmonary disease (COPD) is a progressive inflammatory disease of the airways for which therapeutic options are currently limited after diagnosis, even in its earliest stage. COPD network models are important tools to better understand the biological components and processes underlying initial disease development. With the increasing amount of literature now available, crowdsourcing approaches offer new forms of collaboration for researchers to review biological findings, which can be applied to the construction and verification of complex biological networks. We report the construction of 50 biological network models relevant to lung biology and early COPD using an integrative systems biology and collaborative crowd-verification approach. By combining traditional literature curation with a data-driven approach that predicts molecular activities from transcriptomics data, we constructed an initial COPD network model set based on a previously published non-diseased lung-relevant model set. The crowd was given the opportunity to enhance and refine the networks on a website (https://bionet.sbvimprover.com/), to add mechanistic detail, and to critically review existing evidence and evidence added by other users, so as to enhance the accuracy of the biological representation of the processes captured in the networks. Finally, scientists and experts in the field discussed and refined the networks during an in-person jamboree meeting. Here, we describe examples of the changes made to three of these networks: Neutrophil Signaling, Macrophage Signaling, and Th1-Th2 Signaling. We describe an innovative approach to biological network construction that combines literature mining, data mining, and crowdsourcing to generate a comprehensive set of COPD-relevant models that can be used to help understand the mechanisms related to lung pathobiology. Registered users of the website can freely browse and download the networks.

    High-Throughput Screening Data Interpretation in the Context of In Vivo Transcriptomic Responses to Oral Cr(VI) Exposure

    The toxicity of hexavalent chromium [Cr(VI)] in drinking water has been studied extensively, and available in vivo and in vitro studies provide a robust dataset for application of advanced toxicological tools to inform the mode of action (MOA). This study aimed to contribute to the understanding of the Cr(VI) MOA by evaluating high-throughput screening (HTS) data and other in vitro data relevant to Cr(VI), and comparing these findings to robust in vivo data, including transcriptomic profiles in target tissues. Evaluation of Tox21 HTS data for Cr(VI) identified 11 active assay endpoints relevant to the Ten Key Characteristics of Carcinogens (TKCCs) that have been proposed by other investigators. Four of these endpoints were related to TP53 (tumor protein 53) activation, mapping to genotoxicity (KCC#2), and four were related to cell death/proliferation (KCC#10). HTS results were consistent with other in vitro data from the Comparative Toxicogenomics Database. In vitro responses were compared to in vivo transcriptomic responses in the most sensitive target tissue, the duodenum, of mice exposed to ≤ 180 ppm Cr(VI) for 7 and 90 days. Pathways that were altered both in vitro and in vivo included those relevant to cell death/proliferation. In contrast, pathways relevant to p53/DNA damage were identified in vitro but not in vivo. Benchmark dose modeling and phenotypic anchoring of in vivo transcriptomic responses strengthened the finding that Cr(VI) causes cell stress/injury followed by proliferation in the mouse duodenum at high doses. These findings contribute to the body of evidence supporting a non-mutagenic MOA for Cr(VI)-induced intestinal cancer.
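    To illustrate the benchmark dose idea mentioned above, the sketch below fits a Hill model to an invented dose-response series and inverts it at a chosen benchmark response. Real BMD workflows also compare candidate models and report confidence limits; the data here are made up.

```python
# Illustrative benchmark-dose calculation: fit a Hill model to a
# dose-response series and invert it at a chosen benchmark response (BMR).
# The data points are invented for demonstration only.
import numpy as np
from scipy.optimize import curve_fit

def hill(dose, top, ec50, n):
    return top * dose**n / (ec50**n + dose**n)

doses = np.array([0.0, 5.0, 20.0, 60.0, 180.0])      # ppm
response = np.array([0.0, 0.08, 0.35, 0.78, 0.96])   # fraction of max effect

params, _ = curve_fit(hill, doses, response, p0=[1.0, 30.0, 1.5],
                      bounds=([1e-3, 1e-3, 0.5], [2.0, 1000.0, 10.0]))
top, ec50, n = params

bmr = 0.10  # 10% benchmark response
bmd = ec50 * (bmr / (top - bmr)) ** (1.0 / n)  # hill() solved for dose at bmr
print(f"BMD at {bmr:.0%} response: {bmd:.1f} ppm")
```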

    Evaluation of large language models for discovery of gene set function

    Gene set analysis is a mainstay of functional genomics, but it relies on manually curated databases of gene functions that are incomplete and unaware of biological context. Here we evaluate the ability of OpenAI's GPT-4, a Large Language Model (LLM), to develop hypotheses about common gene functions from its embedded biomedical knowledge. We created a GPT-4 pipeline to label gene sets with names that summarize their consensus functions, substantiated by analysis text and citations. Benchmarking against named gene sets in the Gene Ontology, GPT-4 generated very similar names in 50% of cases, while in most remaining cases it recovered the name of a more general concept. In gene sets discovered in 'omics data, GPT-4 names were more informative than gene set enrichment, with supporting statements and citations that were largely verified in human review. The ability to rapidly synthesize common gene functions positions LLMs as valuable functional genomics assistants.
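    A pipeline of this kind can be sketched in a few lines. The sketch assumes the openai Python package with an API key in the environment; the prompt wording and example gene set are illustrative, not the authors' actual pipeline.

```python
# Minimal sketch: ask an LLM to propose a name for a gene set. Assumes
# the openai package and OPENAI_API_KEY in the environment; the prompt
# is illustrative, not the authors' actual pipeline.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def name_gene_set(genes):
    prompt = (
        "Propose a concise name summarizing the most likely shared function "
        "of these human genes, then justify it briefly:\n" + ", ".join(genes)
    )
    reply = client.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user", "content": prompt}],
    )
    return reply.choices[0].message.content

print(name_gene_set(["LIG1", "POLD1", "PCNA", "FEN1", "RFC1"]))
```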

    Context-Aware Verification of DMN

    The Decision Model and Notation (DMN) standard is a user-friendly notation for decision logic. Many tools are available to verify the correctness of DMN decision tables. However, most of these look at a table in isolation, with little or no regard for its context. In this work, we argue for the importance of context, and extend the formal verification criteria to include it. We identify two forms of context, namely in-model context and background knowledge. We also present our own context-aware verification tool, implemented in our DMN-IDP interface, and show that this context-aware approach allows us to perform more thorough verification than any other available tool.
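    The sketch below shows two classic decision-table checks, rule overlap and input completeness, for a single numeric input with half-open intervals. Real DMN verifiers handle multiple typed inputs and hit policies; this only conveys the idea.

```python
# Toy version of two classic decision-table checks: rule overlap and
# input completeness, for one numeric input with half-open [lo, hi)
# intervals. Real DMN verifiers are far more general.
def check_table(rules, domain):
    """rules: list of (lo, hi) intervals; domain: (lo, hi) of the input."""
    issues = []
    for i, (a1, b1) in enumerate(rules):
        for j, (a2, b2) in enumerate(rules[i + 1:], start=i + 1):
            if max(a1, a2) < min(b1, b2):   # intervals intersect
                issues.append(f"rules {i} and {j} overlap")
    covered_up_to = domain[0]
    for lo, hi in sorted(rules):
        if lo > covered_up_to:
            issues.append(f"gap: [{covered_up_to}, {lo}) is unhandled")
        covered_up_to = max(covered_up_to, hi)
    if covered_up_to < domain[1]:
        issues.append(f"gap: [{covered_up_to}, {domain[1]}) is unhandled")
    return issues

print(check_table([(0, 18), (18, 65), (60, 120)], domain=(0, 120)))
# -> overlap between rules 1 and 2 (inputs 60-64 match both)
```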