11 research outputs found

Biomedical knowledge graph-enhanced prompt generation for large language models

Author: Akbas Rabia E
Baranzini Sergio E
Cerono Gabriel
Huang Sui
Israni Sharat
Morris John H
Nelson Charlotte A
Peetoom Braian
Rizk-Jackson Angela
Rose Peter W
Shi Yongmei
Smith Brett
Soman Karthik
Villouta-Reyes Catalina
Publication date: 28/11/2023

Large Language Models (LLMs) have been driving progress in AI at an unprecedented rate, yet still face challenges in knowledge-intensive domains like biomedicine. Solutions such as pre-training and domain-specific fine-tuning add substantial computational overhead, and the latter requires domain expertise. External knowledge infusion is task-specific and requires model training. Here, we introduce a task-agnostic Knowledge Graph-based Retrieval Augmented Generation (KG-RAG) framework by leveraging the massive biomedical KG SPOKE with LLMs such as Llama-2-13b, GPT-3.5-Turbo and GPT-4, to generate meaningful biomedical text rooted in established knowledge. KG-RAG consistently enhanced the performance of LLMs across various prompt types, including one-hop and two-hop prompts, drug repurposing queries, biomedical true/false questions, and multiple-choice questions (MCQ). Notably, KG-RAG provides a remarkable 71% boost in the performance of the Llama-2 model on the challenging MCQ dataset, demonstrating the framework's capacity to empower open-source models with fewer parameters for domain-specific questions. Furthermore, KG-RAG enhanced the performance of proprietary GPT models, such as GPT-3.5 which exhibited improvement over GPT-4 in context utilization on MCQ data. Our approach was also able to address drug repurposing questions, returning meaningful repurposing suggestions. In summary, the proposed framework combines explicit and implicit knowledge of KG and LLM, respectively, in an optimized fashion, thus enhancing the adaptability of general-purpose LLMs to tackle domain-specific questions in a unified framework.

Comment: 28 pages, 5 figures, 2 tables, 1 supplementary fil
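
The retrieval-augmented idea in this abstract can be illustrated with a minimal sketch. It is not the authors' KG-RAG implementation: the triples in `TOY_KG`, the entity names, and the helper functions `retrieve_context` and `build_prompt` are all hypothetical, and simple substring matching stands in for whatever entity recognition and context pruning the real framework uses. The sketch only shows the general pattern of pulling graph facts relevant to a question and placing them in the prompt before the question is sent to an LLM.

```python
# Toy stand-in for a biomedical KG: (subject, predicate, object) triples.
# These triples are illustrative placeholders, not data taken from SPOKE.
TOY_KG = [
    ("disease_A", "ASSOCIATES_WITH", "gene_X"),
    ("drug_B", "TREATS", "disease_A"),
    ("gene_X", "PARTICIPATES_IN", "pathway_Y"),
]


def retrieve_context(question, kg, max_triples=5):
    """Return triples whose subject or object is mentioned in the question.

    Assumption: plain substring matching replaces real entity recognition.
    """
    q = question.lower()
    hits = [t for t in kg if t[0].lower() in q or t[2].lower() in q]
    return hits[:max_triples]


def build_prompt(question, kg):
    """Prepend the retrieved triples to the question as plain-text context."""
    triples = retrieve_context(question, kg)
    context = "\n".join(f"{s} {p} {o}" for s, p, o in triples)
    return f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"


# The resulting string would be passed to an LLM of choice (not shown here).
prompt = build_prompt("Which drug treats disease_A?", TOY_KG)
```

Only the two triples that mention `disease_A` end up in the context, which is the point of the pattern: the model sees a small, question-specific slice of the graph rather than the whole KG.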

arXiv.org e-Print Archive

PatientExploreR: an extensible application for dynamic visualization of patient clinical history from electronic health records in the OMOP common data model.

Author: Attali
Atul J Butte
Badgeley
Benjamin S Glicksberg
Bethany Percha
Boris Oskotsky
Chang
Debajyoti Datta
Duke
Estiri
Eugenia Rutenberg
Frankovich
Glicksberg
Hirsch
Hripcsak
Hripcsak
Huser
Jensen
Joel T Dudley
Jonathan Wren
Kipp W Johnson
Krause
Levine
Li Li
Malik
Mandel
Marcus A Badgeley
Mark M Shervey
Nadav Rappoport
Nelson Lee
Nicholas Giangreco
Nicholas P Tatonetti
Perer
Phyllis M Thangaraj
Pivovarov
Rajkomar
Remi Frazier
Riccardo Miotto
Rick Larsen
Rind
Schuemie
Shaddox
Sharat Israni
Sievert
Soulakis
Theodore C Goldstein
Vashisht
Vivek A Rudrapatna
West
Zhang
Publication venue: eScholarship, University of California
Publication date: 01/11/2019

Motivation: Electronic health records (EHRs) are quickly becoming omnipresent in healthcare, but interoperability issues and technical demands limit their use for biomedical and clinical research. Interactive and flexible software that interfaces directly with EHR data structured around a common data model (CDM) could accelerate more EHR-based research by making the data more accessible to researchers who lack computational expertise and/or domain knowledge. Results: We present PatientExploreR, an extensible application built on the R/Shiny framework that interfaces with a relational database of EHR data in the Observational Medical Outcomes Partnership CDM format. PatientExploreR produces patient-level interactive and dynamic reports and facilitates visualization of clinical data without any programming required. It allows researchers to easily construct and export patient cohorts from the EHR for analysis with other software. This application could enable easier exploration of patient-level data for physicians and researchers. PatientExploreR can incorporate EHR data from any institution that employs the CDM for users with approved access. The software code is free and open source under the MIT license, enabling institutions to install and users to expand and modify the application for their own purposes. Availability and implementation: PatientExploreR can be freely obtained from GitHub: https://github.com/BenGlicksberg/PatientExploreR. We provide instructions for how researchers with approved access to their institutional EHR can use this package. We also release an open sandbox server of synthesized patient data for users without EHR access to explore: http://patientexplorer.ucsf.edu. Supplementary information: Supplementary data are available at Bioinformatics online.

Crossref

eScholarship - University of California

A certified de-identification system for all clinical text documents for information extraction at scale.

Author: Ashouri Choshali Habibeh
Butte Atul J
Israni Sharat
Muenzen Kathleen
Oskotsky Boris
Plunkett Thomas
Radhakrishnan Lakshmi
Schenk Gundolf
Publication venue: eScholarship, University of California
Publication date: 04/07/2023

Objectives: Clinical notes are a veritable treasure trove of information on a patient's disease progression, medical history, and treatment plans, yet are locked in secured databases accessible for research only after extensive ethics review. Removing personally identifying and protected health information (PII/PHI) from the records can reduce the need for additional Institutional Review Boards (IRB) reviews. In this project, our goals were to: (1) develop a robust and scalable clinical text de-identification pipeline that is compliant with the Health Insurance Portability and Accountability Act (HIPAA) Privacy Rule for de-identification standards and (2) share routinely updated de-identified clinical notes with researchers. Materials and methods: Building on our open-source de-identification software called Philter, we added features to: (1) make the algorithm and the de-identified data HIPAA compliant, which also implies type 2 error-free redaction, as certified via external audit; (2) reduce over-redaction errors; and (3) normalize and shift date PHI. We also established a streamlined de-identification pipeline using MongoDB to automatically extract clinical notes and provide truly de-identified notes to researchers with periodic monthly refreshes at our institution. Results: To the best of our knowledge, the Philter V1.0 pipeline is currently the first and only certified, de-identified redaction pipeline that makes clinical notes available to researchers for nonhuman subjects' research, without further IRB approval needed. To date, we have made over 130 million certified de-identified clinical notes available to over 600 UCSF researchers. These notes were collected over the past 40 years, and represent data from 2757016 UCSF patients.

eScholarship - University of California

A Biomedical Open Knowledge Network Harnesses the Power of AI to Understand Deep Human Biology

Author: Baranzini Sergio
Bove Riley
Börner Katy
Herr II Bruce
Huang Sui
Israni Sharat
Keiser Michael
Morris John
Musen Mark
Nelson Charlotte
Oskotsky Boris
Pearce Roger
Rankin Katherine
Reza Tahsin
Rizk-Jackson Angela
Rose Peter
Sanders Stephan
Schleimer Erica
Smith Brett
Soman Karthik
Publication venue: eScholarship, University of California
Publication date: 28/09/2023

Knowledge representation and reasoning (KR&R) has been successfully implemented in many fields to enable computers to solve complex problems with AI methods. However, its application to biomedicine has been lagging in part due to the daunting complexity of molecular and cellular pathways that govern human physiology and pathology. In this article, we describe concrete uses of Scalable PrecisiOn Medicine Knowledge Engine (SPOKE), an open knowledge network that connects curated information from thirty-seven specialized and human-curated databases into a single property graph, with 3 million nodes and 15 million edges to date. Applications discussed in this article include drug discovery, COVID-19 research, and chronic disease diagnosis and management.

eScholarship - University of California

A biomedical open knowledge network harnesses the power of AI to understand deep human biology.

Author: Baranzini Sergio E
Bove Riley
Börner Katy
Herr Bruce W
Huang Sui
Israni Sharat
Keiser Michael
Morris John
Musen Mark
Nelson Charlotte A
Oskotsky Boris
Pearce Roger
Rankin Katherine P
Reza Tahsin
Rizk-Jackson Angela
Rose Peter W
Sanders Stephan J
Schleimer Erica
Smith Brett
Soman Karthik
Publication venue: eScholarship, University of California
Publication date: 01/01/2022

Knowledge representation and reasoning (KR&R) has been successfully implemented in many fields to enable computers to solve complex problems with AI methods. However, its application to biomedicine has been lagging in part due to the daunting complexity of molecular and cellular pathways that govern human physiology and pathology. In this article we describe concrete uses of SPOKE, an open knowledge network that connects curated information from 37 specialized and human-curated databases into a single property graph, with 3 million nodes and 15 million edges to date. Applications discussed in this article include drug discovery, COVID-19 research and chronic disease diagnosis and management

eScholarship - University of California

A biomedical open knowledge network harnesses the power of AI to understand deep human biology.

Author: Baranzini Sergio E
Bove Riley
Börner Katy
Herr Bruce W
Huang Sui
Israni Sharat
Keiser Michael
Morris John
Musen Mark
Nelson Charlotte A
Oskotsky Boris
Pearce Roger
Rankin Katherine P
Reza Tahsin
Rizk-Jackson Angela
Rose Peter W
Sanders Stephan J
Schleimer Erica
Smith Brett
Soman Karthik
Publication venue: Providence St. Joseph Health Digital Commons
Publication date: 01/01/2022

Knowledge representation and reasoning (KR&R) has been successfully implemented in many fields to enable computers to solve complex problems with AI methods. However, its application to biomedicine has been lagging in part due to the daunting complexity of molecular and cellular pathways that govern human physiology and pathology. In this article we describe concrete uses of SPOKE, an open knowledge network that connects curated information from 37 specialized and human-curated databases into a single property graph, with 3 million nodes and 15 million edges to date. Applications discussed in this article include drug discovery, COVID-19 research and chronic disease diagnosis and management

eScholarship - University of California

Providence St. Joseph Health Digital Commons

The scalable precision medicine open knowledge engine (SPOKE): a massive knowledge graph of biomedical information

Author: Akbas Rabia E
Baranzini Sergio E
Bharat Krish
Cerono Gabriel
Chakraborty Arjun
Costes Sylvain V
Hardi Josef
Harroud Adil
Huang Conrad C
Huang Sui
Israni Sharat
Keiser Michael
Mardirossian Taline
Meng Elaine C
Morris John H
Musen Mark
Nelson Charlotte A
Pico Alexander R
Rizk-Jackson Angela
Rose Peter W
Sanders Lauren
Schenk Gundolf
Shi Yongmei
Smith Brett
Soman Karthik
Tang Alice
Zhou Xiaoyuan
Publication venue: eScholarship, University of California
Publication date: 03/02/2023

Motivation: Knowledge graphs (KGs) are being adopted in industry, commerce and academia. Biomedical KG presents a challenge due to the complexity, size and heterogeneity of the underlying information. Results: In this work, we present the Scalable Precision Medicine Open Knowledge Engine (SPOKE), a biomedical KG connecting millions of concepts via semantically meaningful relationships. SPOKE contains 27 million nodes of 21 different types and 53 million edges of 55 types downloaded from 41 databases. The graph is built on the framework of 11 ontologies that maintain its structure, enable mappings and facilitate navigation. SPOKE is built weekly by Python scripts which download each resource, check for integrity and completeness, and then create a 'parent table' of nodes and edges. Graph queries are translated by a REST API and users can submit searches directly via an API or a graphical user interface. Conclusions/Significance: SPOKE enables the integration of seemingly disparate information to support precision medicine efforts. Availability and implementation: The SPOKE neighborhood explorer is available at https://spoke.rbvi.ucsf.edu. Supplementary information: Supplementary data are available at Bioinformatics online.
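
The build step described in this abstract (collect each resource, check it for completeness, then merge everything into a table of nodes and edges) can be sketched roughly as follows. This is an illustration under stated assumptions, not the SPOKE build code: the record layout, the `REQUIRED_FIELDS` tuple, the resource names, and the functions `check_records` and `build_parent_table` are hypothetical, and in-memory dictionaries stand in for downloaded files.

```python
# Hypothetical record layout: every edge record needs these three fields.
REQUIRED_FIELDS = ("source", "edge_type", "target")


def check_records(records):
    """Keep only records that have every required field filled in."""
    return [r for r in records if all(r.get(f) for f in REQUIRED_FIELDS)]


def build_parent_table(resources):
    """Merge validated records from all resources into node and edge tables.

    Sets remove duplicates; each edge keeps the name of the resource it
    came from so its provenance stays visible.
    """
    nodes, edges = set(), set()
    for name, records in resources.items():
        for r in check_records(records):
            nodes.add(r["source"])
            nodes.add(r["target"])
            edges.add((r["source"], r["edge_type"], r["target"], name))
    return sorted(nodes), sorted(edges)


# Placeholder inputs standing in for two downloaded resources.
resources = {
    "resource_1": [
        {"source": "gene_X", "edge_type": "ASSOCIATES_WITH", "target": "disease_A"},
        {"source": "gene_Z", "edge_type": "", "target": "disease_A"},  # incomplete, dropped
    ],
    "resource_2": [
        {"source": "drug_B", "edge_type": "TREATS", "target": "disease_A"},
    ],
}

nodes, edges = build_parent_table(resources)
```

Because the incomplete record is filtered out before merging, `gene_Z` never enters the node table, which mirrors the idea of validating each resource before it contributes to the graph.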

eScholarship - University of California

The scalable precision medicine open knowledge engine (SPOKE): a massive knowledge graph of biomedical information.

Author: Akbas Rabia E
Baranzini Sergio E
Bharat Krish
Cerono Gabriel
Chakraborty Arjun
Costes Sylvain V
Hardi Josef
Harroud Adil
Huang Conrad C
Huang Sui
Israni Sharat
Keiser Michael
Mardirossian Taline
Meng Elaine C
Morris John H
Musen Mark
Nelson Charlotte A
Pico Alexander R
Rizk-Jackson Angela
Rose Peter W
Sanders Lauren
Schenk Gundolf
Shi Yongmei
Smith Brett
Soman Karthik
Tang Alice
Zhou Xiaoyuan
Publication venue: Providence St. Joseph Health Digital Commons
Publication date: 03/02/2023

MOTIVATION: Knowledge graphs (KGs) are being adopted in industry, commerce and academia. Biomedical KG presents a challenge due to the complexity, size and heterogeneity of the underlying information. RESULTS: In this work, we present the Scalable Precision Medicine Open Knowledge Engine (SPOKE), a biomedical KG connecting millions of concepts via semantically meaningful relationships. SPOKE contains 27 million nodes of 21 different types and 53 million edges of 55 types downloaded from 41 databases. The graph is built on the framework of 11 ontologies that maintain its structure, enable mappings and facilitate navigation. SPOKE is built weekly by Python scripts which download each resource, check for integrity and completeness, and then create a 'parent table' of nodes and edges. Graph queries are translated by a REST API and users can submit searches directly via an API or a graphical user interface. Conclusions/Significance: SPOKE enables the integration of seemingly disparate information to support precision medicine efforts. AVAILABILITY AND IMPLEMENTATION: The SPOKE neighborhood explorer is available at https://spoke.rbvi.ucsf.edu. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

Providence St. Joseph Health Digital Commons

Emerging role of artificial intelligence in cardiac electrophysiology

Artificial intelligence (AI) and machine learning (ML) have significantly impacted the field of cardiovascular medicine, especially cardiac electrophysiology (EP), on multiple fronts. The goal of this review is to familiarize readers with the field of AI and ML and their emerging role in EP. The current review is divided into 3 sections. In the first section, we discuss the definitions and basics of AI, ML, and big data. In the second section, we discuss their application to EP in the context of detection, prediction, and management of arrhythmias. Finally, we discuss the regulatory issues, challenges, and future directions of AI in EP

Directory of Open Access Journals