1,114 research outputs found

    Advancing Biomedicine with Graph Representation Learning: Recent Progress, Challenges, and Future Directions

    Graph representation learning (GRL) has emerged as a pivotal area that has contributed significantly to breakthroughs in various fields, including biomedicine. The objective of this survey is to review the latest advancements in GRL methods and their applications in the biomedical field. We also highlight key challenges currently faced by GRL and outline potential directions for future research.
    Comment: Accepted by 2023 IMIA Yearbook of Medical Informatics

    DHLP 1&2: Giraph based distributed label propagation algorithms on heterogeneous drug-related networks

    Background and Objective: Heterogeneous complex networks are large graphs consisting of different types of nodes and edges. Extracting knowledge from these networks is complicated, and their scale is steadily increasing, so scalable methods are required. Methods: In this paper, two distributed label propagation algorithms for heterogeneous networks, namely DHLP-1 and DHLP-2, are introduced. Biological networks are one type of heterogeneous complex network. As a case study, we measured the efficiency of the proposed DHLP-1 and DHLP-2 algorithms on a biological network consisting of drugs, diseases, and targets. The problem studied on this network is drug repositioning, but the algorithms can be used as general methods for heterogeneous networks other than biological ones. Results: We compared the proposed algorithms with their similar non-distributed counterparts, namely MINProp and Heter-LP. The experiments revealed the good performance of the algorithms in terms of running time and accuracy.
    Comment: Source code available for Apache Giraph on Hadoop

    Knowledge graphs for COVID-19: An exploratory review of the current landscape

    Background: Searching through the COVID-19 research literature to gain actionable clinical insight is a formidable task, even for experts. The usefulness of this corpus in terms of improving patient care is tied to the ability to see the big picture that emerges when the studies are seen in conjunction rather than in isolation. When the answer to a search query requires linking together multiple pieces of information across documents, simple keyword searches are insufficient. To answer such complex information needs, an artificial intelligence (AI) technology known as a knowledge graph (KG) could prove to be effective. Methods: We conducted an exploratory literature review of KG applications in the context of COVID-19. The search term used was "covid-19 knowledge graph". In addition to PubMed, the first five pages of search results for Google Scholar and Google were considered for inclusion. Google Scholar was used to include non-peer-reviewed or non-indexed articles such as pre-prints and conference proceedings. Google was used to identify companies or consortiums active in this domain that have not published any literature, peer-reviewed or otherwise. Results: Our search yielded 34 results on PubMed and 50 results each on Google and Google Scholar. We found KGs being used for facilitating literature search, drug repurposing, clinical trial mapping, and risk factor analysis. Conclusions: Our synopses of these works make a compelling case for the utility of this nascent field of research.

    Knowledge-based Biomedical Data Science 2019

    Knowledge-based biomedical data science (KBDS) involves the design and implementation of computer systems that act as if they knew about biomedicine. Such systems depend on formally represented knowledge, often in the form of knowledge graphs. Here we survey the progress in the last year in systems that use formally represented knowledge to address data science problems in both clinical and biological domains, as well as in approaches for creating knowledge graphs. Major themes include the relationships between knowledge graphs and machine learning, the use of natural language processing, and the expansion of knowledge-based approaches to novel domains, such as Chinese Traditional Medicine and biodiversity.
    Comment: Manuscript 43 pages with 3 tables; Supplemental material 43 pages with 3 tables

    Natural Language Processing for Drug Discovery Knowledge Graphs: promises and pitfalls

    Building and analysing knowledge graphs (KGs) to aid drug discovery is a topical area of research. A salient feature of KGs is their ability to combine many heterogeneous data sources in a format that facilitates discovering connections. The utility of KGs has been exemplified in areas such as drug repurposing, with insights made through manual exploration and modelling of the data. In this article, we discuss promises and pitfalls of using natural language processing (NLP) to mine unstructured text, typically from scientific literature, as a data source for KGs. This draws on our experience of initially parsing structured data sources such as ChEMBL as the basis for data within a KG, and then enriching or expanding upon them using NLP. The fundamental promise of NLP for KGs is the automated extraction of data from millions of documents, a task practically impossible to do via human curation alone. However, there are many potential pitfalls in NLP-KG pipelines, such as incorrect named entity recognition and ontology linking, all of which could ultimately lead to erroneous inferences and conclusions.
    Comment: 17 pages, 7 figures

    Exposing Provenance Metadata Using Different RDF Models

    A standard model for exposing structured provenance metadata of scientific assertions on the Semantic Web would increase interoperability, discoverability, reliability, and reproducibility for scientific discourse and evidence-based knowledge discovery. Several Resource Description Framework (RDF) models have been proposed to track provenance. However, provenance metadata may be not only verbose but also significantly redundant. Therefore, an appropriate RDF provenance model should be efficient for publishing, querying, and reasoning over Linked Data. In the present work, we have collected millions of pairwise relations between chemicals, genes, and diseases from multiple data sources, and demonstrated the extent of redundancy of provenance information in the life science domain. We also evaluated the suitability of several RDF provenance models for this crowdsourced data set, including the N-ary model, the Singleton Property model, and the Nanopublication model. We examined query performance against three commonly used large RDF stores: Virtuoso, Stardog, and Blazegraph. Our experiments demonstrate that query performance depends on both the RDF store and the RDF provenance model.