16 research outputs found

KG-Hub-building and exchanging biological knowledge graphs.

Author: Balhoff Jim
Bruskiewich Richard M
Callahan Tiffany J
Cappelletti Luca
Carbon Seth
Caufield J Harry
Chan Lauren E
Cortes Katherina
Elsarboukh Glass
Fontana Tommaso
Haendel Melissa A
Harris Nomi L
Hegde Harshad
Joachimiak Marcin P
Matentzoglu Nicolas
Moxon Sierra A T
Mungall Christopher J
Munoz-Torres Monica C
Putman Tim
Ravanmehr Vida
Reese Justin T
Robinson Peter N
Schaper Kevin
Shefchek Kent A
Thessen Anne E
Unni Deepak R
Publication venue: The Mouseion at the JAXlibrary
Publication date: 01/07/2023

MOTIVATION: Knowledge graphs (KGs) are a powerful approach for integrating heterogeneous data and making inferences in biology and many other domains, but a coherent solution for constructing, exchanging, and facilitating the downstream use of KGs is lacking. RESULTS: Here we present KG-Hub, a platform that enables standardized construction, exchange, and reuse of KGs. Features include a simple, modular extract-transform-load pattern for producing graphs compliant with Biolink Model (a high-level data model for standardizing biological data), easy integration of any OBO (Open Biological and Biomedical Ontologies) ontology, cached downloads of upstream data sources, versioned and automatically updated builds with stable URLs, web-browsable storage of KG artifacts on cloud infrastructure, and easy reuse of transformed subgraphs across projects. Current KG-Hub projects span use cases including COVID-19 research, drug repurposing, microbial-environmental interactions, and rare disease research. KG-Hub is equipped with tooling to easily analyze and manipulate KGs. KG-Hub is also tightly integrated with graph machine learning (ML) tools which allow automated graph ML, including node embeddings and training of models for link prediction and node classification. AVAILABILITY AND IMPLEMENTATION: https://kghub.org

The Jackson Laboratory: The Mouseion at the JAXlibrary

Biolink Model: A universal schema for knowledge graphs in clinical, biomedical, and translational science.

Author: Unni Deepak R
Publication date: 25/07/2022

Ezid

Hymenoptera Genome Database: integrating genome annotations in HymenopteraMine

Author: Aditi Tayal
Biewer
Christine G. Elsik
Colin M. Diesh
Darren E. Hagen
Deepak R. Unni
Hung N. Nguyen
Kalderimis
Marianne L. Emery
Sullivan
Publication venue: Oxford University Press (OUP)

Crossref

Apollo: Democratizing genome annotation.

Author: Diesh Colin
Dunn Nathan A
Elsik Christine G
Harris Nomi L
Holmes Ian H
Lewis Suzanna E
Munoz-Torres Monica
Rasche Helena
Unni Deepak R
Yao Eric
Publication venue: eScholarship, University of California
Publication date: 06/02/2019

Genome annotation is the process of identifying the location and function of a genome's encoded features. Improving the biological accuracy of annotation is a complex and iterative process requiring researchers to review and incorporate multiple sources of information such as transcriptome alignments, predictive models based on sequence profiles, and comparisons to features found in related organisms. Because rapidly decreasing costs are enabling an ever-growing number of scientists to incorporate sequencing as a routine laboratory technique, there is widespread demand for tools that can assist in the deliberative analytical review of genomic information. To this end, we present Apollo, an open source software package that enables researchers to efficiently inspect and refine the precise structure and role of genomic features in a graphical browser-based platform. Some of Apollo's newer user interface features include support for real-time collaboration, allowing distributed users to simultaneously edit the same encoded features while also instantly seeing the updates made by other researchers on the same region in a manner similar to Google Docs. Its technical architecture enables Apollo to be integrated into multiple existing genomic analysis pipelines and heterogeneous laboratory workflow platforms. Finally, we consider the implications that Apollo and related applications may have on how the results of genome research are published and made accessible.

eScholarship - University of California

Apollo: Democratizing genome annotation.

Author: Christine G Elsik
Colin Diesh
Deepak R Unni
Eric Yao
Helena Rasche
Ian H Holmes
Monica Munoz-Torres
Nathan A Dunn
Nomi L Harris
Suzanna E Lewis
Publication venue: Public Library of Science (PLoS)
Publication date: 04/01/2019

Directory of Open Access Journals

eScholarship - University of California

Biolink Model: A universal schema for knowledge graphs in clinical, biomedical, and translational science.

Author: Bada Michael
Biomedical Data Translator Consortium
Bizon Chris
Brush Matthew
Bruskiewich Richard
Caufield J Harry
Clemons Paul A
Dancik Vlado
Dumontier Michel
Fecho Karamarie
Glusman Gustavo
Hadlock Jennifer J
Haendel Melissa A
Harris Nomi L
Joshi Arpita
Moxon Sierra AT
Mungall Christopher J
Putman Tim
Qin Guangrong
Ramsey Stephen A
Shefchek Kent A
Solbrig Harold
Soman Karthik
Thessen Anne E
Unni Deepak R
Publication venue: eScholarship, University of California
Publication date: 01/08/2022

Within clinical, biomedical, and translational science, an increasing number of projects are adopting graphs for knowledge representation. Graph-based data models elucidate the interconnectedness among core biomedical concepts, enable data structures to be easily updated, and support intuitive queries, visualizations, and inference algorithms. However, knowledge discovery across these "knowledge graphs" (KGs) has remained difficult. Data set heterogeneity and complexity; the proliferation of ad hoc data formats; poor compliance with guidelines on findability, accessibility, interoperability, and reusability; and, in particular, the lack of a universally accepted, open-access model for standardization across biomedical KGs has left the task of reconciling data sources to downstream consumers. Biolink Model is an open-source data model that can be used to formalize the relationships between data structures in translational science. It incorporates object-oriented classification and graph-oriented features. The core of the model is a set of hierarchical, interconnected classes (or categories) and relationships between them (or predicates) representing biomedical entities such as gene, disease, chemical, anatomic structure, and phenotype. The model provides class and edge attributes and associations that guide how entities should relate to one another. Here, we highlight the need for a standardized data model for KGs, describe Biolink Model, and compare it with other models. We demonstrate the utility of Biolink Model in various initiatives, including the Biomedical Data Translator Consortium and the Monarch Initiative, and show how it has supported easier integration and interoperability of biomedical KGs, bringing together knowledge from multiple sources and helping to realize the goals of translational science.

eScholarship - University of California

Biolink Model: A universal schema for knowledge graphs in clinical, biomedical, and translational science.

Author: Bada Michael
Biomedical Data Translator Consortium
Bizon Chris
Brush Matthew
Bruskiewich Richard
Caufield J Harry
Clemons Paul A
Dancik Vlado
Dumontier Michel
Fecho Karamarie
Glusman Gustavo
Hadlock Jennifer J
Haendel Melissa A
Harris Nomi L
Joshi Arpita
Moxon Sierra A T
Mungall Christopher J
Putman Tim
Qin Guangrong
Ramsey Stephen A
Shefchek Kent A
Solbrig Harold
Soman Karthik
Thessen Anne E
Unni Deepak R
Publication venue: Providence St. Joseph Health Digital Commons
Publication date: 25/03/2022

Within clinical, biomedical, and translational science, an increasing number of projects are adopting graphs for knowledge representation. Graph-based data models elucidate the interconnectedness among core biomedical concepts, enable data structures to be easily updated, and support intuitive queries, visualizations, and inference algorithms. However, knowledge discovery across these knowledge graphs (KGs) has remained difficult. Data set heterogeneity and complexity; the proliferation of ad hoc data formats; poor compliance with guidelines on findability, accessibility, interoperability, and reusability; and, in particular, the lack of a universally accepted, open-access model for standardization across biomedical KGs has left the task of reconciling data sources to downstream consumers. Biolink Model is an open-source data model that can be used to formalize the relationships between data structures in translational science. It incorporates object-oriented classification and graph-oriented features. The core of the model is a set of hierarchical, interconnected classes (or categories) and relationships between them (or predicates) representing biomedical entities such as gene, disease, chemical, anatomic structure, and phenotype. The model provides class and edge attributes and associations that guide how entities should relate to one another. Here, we highlight the need for a standardized data model for KGs, describe Biolink Model, and compare it with other models. We demonstrate the utility of Biolink Model in various initiatives, including the Biomedical Data Translator Consortium and the Monarch Initiative, and show how it has supported easier integration and interoperability of biomedical KGs, bringing together knowledge from multiple sources and helping to realize the goals of translational science.

arXiv.org e-Print Archive

eScholarship - University of California

Providence St. Joseph Health Digital Commons

KG-Hub—building and exchanging biological knowledge graphs

Author: Balhoff Jim
Bruskiewich Richard M
Callahan Tiffany J
Cappelletti Luca
Carbon Seth
Caufield J Harry
Chan Lauren E
Cortes Katherina
Elsarboukh Glass
Fontana Tommaso
Haendel Melissa A
Harris Nomi L
Hegde Harshad
Joachimiak Marcin P
Matentzoglu Nicolas
Moxon Sierra AT
Mungall Christopher J
Munoz-Torres Monica C
Putman Tim
Ravanmehr Vida
Reese Justin T
Robinson Peter N
Schaper Kevin
Shefchek Kent A
Thessen Anne E
Unni Deepak R
Publication venue: eScholarship, University of California
Publication date: 01/07/2023

Motivation: Knowledge graphs (KGs) are a powerful approach for integrating heterogeneous data and making inferences in biology and many other domains, but a coherent solution for constructing, exchanging, and facilitating the downstream use of KGs is lacking. Results: Here we present KG-Hub, a platform that enables standardized construction, exchange, and reuse of KGs. Features include a simple, modular extract-transform-load pattern for producing graphs compliant with Biolink Model (a high-level data model for standardizing biological data), easy integration of any OBO (Open Biological and Biomedical Ontologies) ontology, cached downloads of upstream data sources, versioned and automatically updated builds with stable URLs, web-browsable storage of KG artifacts on cloud infrastructure, and easy reuse of transformed subgraphs across projects. Current KG-Hub projects span use cases including COVID-19 research, drug repurposing, microbial-environmental interactions, and rare disease research. KG-Hub is equipped with tooling to easily analyze and manipulate KGs. KG-Hub is also tightly integrated with graph machine learning (ML) tools which allow automated graph ML, including node embeddings and training of models for link prediction and node classification. Availability and implementation: https://kghub.org

eScholarship - University of California

An open source knowledge graph ecosystem for the life sciences

Author: Bada Michael
Baumgartner William A
Bennett Tellen D
Boyce Richard D
Callahan Tiffany J
Cappelletti Luca
Casiraghi Elena
Cavalleri Emanuele
Fontana Tommaso
Gillenwater Lucas A
Hoehndorf Robert
Hoyt Charles Tapley
Hripcsak George
Hunter Lawrence E
Joachimiak Marcin P
Kahn Michael G
Malec Scott A
Matentzoglu Nicolas A
Mesiti Marco
Mungall Christopher J
Reese Justin
Robinson Peter N
Ryan Patrick B
Santangelo Brook
Silverstein Jonathan C
Stefanski Adrianne L
Taneja Sanya B
Tripodi Ignacio J
Unni Deepak R
Valentini Giorgio
Vasilevsky Nicole A
Wyrwa Jordan M
Publication venue: eScholarship, University of California
Publication date: 01/04/2024

Translational research requires data at multiple scales of biological organization. Advancements in sequencing and multi-omics technologies have increased the availability of these data, but researchers face significant integration challenges. Knowledge graphs (KGs) are used to model complex phenomena, and methods exist to construct them automatically. However, tackling complex biomedical integration problems requires flexibility in the way knowledge is modeled. Moreover, existing KG construction methods provide robust tooling at the cost of fixed or limited choices among knowledge representation models. PheKnowLator (Phenotype Knowledge Translator) is a semantic ecosystem for automating the FAIR (Findable, Accessible, Interoperable, and Reusable) construction of ontologically grounded KGs with fully customizable knowledge representation. The ecosystem includes KG construction resources (e.g., data preparation APIs), analysis tools (e.g., SPARQL endpoint resources and abstraction algorithms), and benchmarks (e.g., prebuilt KGs). We evaluated the ecosystem by systematically comparing it to existing open-source KG construction methods and by analyzing its computational performance when used to construct 12 different large-scale KGs. With flexible knowledge representation, PheKnowLator enables fully customizable KGs without compromising performance or usability.

eScholarship - University of California