COMPUTATIONAL PHENOTYPING AND DRUG REPURPOSING FROM ELECTRONIC MEDICAL RECORDS

Abstract

Research with electronic medical records (EMR) involves selecting cohorts and manipulating data for tasks such as predictive analysis. Computational phenotyping for cohort characterization and stratification is becoming increasingly important for producing clinically relevant findings. Subject matter experts and researchers devote significant time and effort to manual chart abstraction, which creates a large bottleneck for progress in clinical research. I focus on developing computational phenotyping pipelines and on using EMR for drug repurposing in breast cancer. Drug repurposing is the process of applying drugs that are already on the market to new disease indications. EMR data offer a distinct advantage for drug repurposing: a patient cohort can be observed over time, so drug effects on outcomes can be seen. In this dissertation, I present work on computational phenotyping and EMR-based drug repurposing. First, I use embedding models and foundational natural language processing methods to predict oral cancer risk from pathology notes. Second, I use natural language processing methods and transfer learning for breast cancer cohort selection and information extraction. Third, I present a pipeline for producing drug repurposing candidates from EMR and support its predictions with evidence from biomedical literature and existing clinical trials.

Doctor of Philosophy
