56 research outputs found

An analytical approach to characterize morbidity profile dissimilarity between distinct cohorts using electronic medical records

Author: Basford Melissa A.
Carrell David
Chute Christopher G.
Denny Joshua C.
Kho Abel
Kullo Iftikhar J.
Masys Daniel R.
Peissig Peggy
Pulley Jill M.
Roden Dan M.
Schildcrout Jonathan S.
Wang Deede
Publication venue: Elsevier Inc.
Publication date: 01/12/2010

We describe a two-stage analytical approach for characterizing morbidity profile dissimilarity among patient cohorts using electronic medical records. We capture morbidities using the International Statistical Classification of Diseases and Related Health Problems (ICD-9) codes. In the first stage of the approach, separate logistic regression analyses for ICD-9 sections (e.g., “hypertensive disease” or “appendicitis”) are conducted, and the odds ratios that describe adjusted differences in prevalence between two cohorts are displayed graphically. In the second stage, the results from ICD-9 section analyses are combined into a general morbidity dissimilarity index (MDI). For illustration, we examine nine cohorts of patients representing six phenotypes (or controls) derived from five institutions, each a participant in the electronic MEdical REcords and GEnomics (eMERGE) network. The phenotypes studied include type II diabetes and type II diabetes controls, peripheral arterial disease and peripheral arterial disease controls, normal cardiac conduction as measured by electrocardiography, and senile cataracts.

Elsevier - Publisher Connector

PubMed Central

De-black-boxing health AI: demonstrating reproducible machine learning computable phenotypes using the N3C-RECOVER Long COVID model in the All of Us data repository

Author: Basford Melissa
Chute Christopher G
Crosskey Miles
Gangireddy Srushti
Girvin Andrew T
Haendel Melissa
Harris Paul A
Kerchberger V Eric
Lunt Chris
Master Hiral
Moffitt Richard A
N3C and RECOVER Consortia
Pfaff Emily R
Wei Wei-Qi
Weiner Mark
Publication venue: Oxford University Press
Publication date: 01/01/2023

Machine learning (ML)-driven computable phenotypes are among the most challenging to share and reproduce. Despite this difficulty, the urgent public health considerations around Long COVID make it especially important to ensure the rigor and reproducibility of Long COVID phenotyping algorithms such that they can be made available to a broad audience of researchers. As part of the NIH Researching COVID to Enhance Recovery (RECOVER) Initiative, researchers with the National COVID Cohort Collaborative (N3C) devised and trained an ML-based phenotype to identify patients highly probable to have Long COVID. Supported by RECOVER, N3C and NIH’s All of Us study partnered to reproduce the output of N3C’s trained model in the All of Us data enclave, demonstrating model extensibility in multiple environments. This case study in ML-based phenotype reuse illustrates how open-source software best practices and cross-site collaboration can de-black-box phenotyping algorithms, prevent unnecessary rework, and promote open science in informatics.

Carolina Digital Repository

Electronic Health Record Based Algorithm to Identify Patients with Autism Spectrum Disorder

Author: Abrams Debra
Barbaresi William
Basford Melissa
Bickel Julie
Bing Nicole
Bochenek Joseph
Chen Pei
Cobb Beth A.
Connolly John
Denny Joshua
Doshi-Velez Finale
Hakonarson Hakon
Harley John
Holm Ingrid A.
Kohane Isaac S.
Lingren Nataline
Lingren Todd
Manning-Courtney Patty
Mentch Frank
Namjou Bahram
Ni Yizhao
Perry Cassandra
Qiu Haijun
Reinhold Judy
Savova Guergana
Solti Imre
Vazquez Lyam
Wildenger Welchons Leah
Publication venue: Public Library of Science (PLoS)
Publication date: 29/07/2016

Objective: Cohort selection is challenging for large-scale electronic health record (EHR) analyses, as International Classification of Diseases 9th edition (ICD-9) diagnostic codes are notoriously unreliable disease predictors. Our objective was to develop, evaluate, and validate an automated algorithm for determining an Autism Spectrum Disorder (ASD) patient cohort from EHR. We demonstrate its utility via the largest investigation to date of the co-occurrence patterns of medical comorbidities in ASD. Methods: We extracted ICD-9 codes and concepts derived from the clinical notes. A gold standard patient set was labeled by clinicians at Boston Children’s Hospital (BCH) (N = 150) and Cincinnati Children’s Hospital and Medical Center (CCHMC) (N = 152). Two algorithms were created: (1) a rule-based algorithm implementing the ASD criteria from the Diagnostic and Statistical Manual of Mental Disorders 4th edition, and (2) a predictive classifier. The positive predictive values (PPV) achieved by these algorithms were compared to an ICD-9 code baseline. We clustered the patients based on grouped ICD-9 codes and evaluated subgroups. Results: The rule-based algorithm produced the best PPV: (a) BCH: 0.885 vs. 0.273 (baseline); (b) CCHMC: 0.840 vs. 0.645 (baseline); (c) combined: 0.864 vs. 0.460 (baseline). A validation at Children’s Hospital of Philadelphia yielded 0.848 (PPV). Clustering analyses of comorbidities on the three-site large cohort (N = 20,658 ASD patients) identified psychiatric, developmental, and seizure disorder clusters. Conclusions: In a large cross-institutional cohort, co-occurrence patterns of comorbidities in ASDs provide further hypothetical evidence for distinct courses in ASD. The proposed automated algorithms for cohort selection open avenues for other large-scale EHR studies and individualized treatment of ASD.

Harvard University - DASH

Directory of Open Access Journals

PubMed Central

FigShare

Admixture mapping and subsequent fine-mapping suggests a biologically relevant and novel association on chromosome 11 for type 2 diabetes in African Americans.

Author: Abel N Kho
Dan M Roden
Dana C Crawford
Janina M Jeff
Jennifer A Pacheco
Joshua C Denny
Loren L Armstrong
M Geoffrey Hayes
Marylyn D Ritchie
Melissa A Basford
Rex L Chisholm
Rongling Li
Wendy A Wolf
Publication venue: Public Library of Science (PLoS)
Publication date: 03/03/2014

Type 2 diabetes (T2D) is a complex metabolic disease that disproportionately affects African Americans. Genome-wide association studies (GWAS) have identified several loci that contribute to T2D in European Americans, but few studies have been performed in admixed populations. We first performed a GWAS of 1,563 African Americans from the Vanderbilt Genome-Electronic Records Project and Northwestern University NUgene Project as part of the electronic Medical Records and Genomics (eMERGE) network. We successfully replicate an association in TCF7L2, previously identified by GWAS in this African American dataset. We were unable to identify novel associations at p < 5.0×10⁻⁸ by GWAS. Using admixture mapping as an alternative method for discovery, we performed a genome-wide admixture scan that suggests multiple candidate genes associated with T2D. One finding, TCIRG1, is a T-cell immune regulator expressed in the pancreas and liver that has not been previously implicated for T2D. We performed subsequent fine-mapping to further assess the association between TCIRG1 and T2D in >5,000 African Americans. We identified 13 independent associations between TCIRG1, CHKA, and ALDH3B1 genes on chromosome 11 and T2D. Our results suggest a novel region on chromosome 11 identified by admixture mapping is associated with T2D in African Americans.

Directory of Open Access Journals

PubMed Central

Aligning Social Services for Managed Care

Author: Abramovitz M.
Albrecht Karl
Barnhart Gordon
Cotton Kathleen
Gemignani J.
Godfrey Blanton
Goldsmith J. C.
James Mason
Linda A. Galvin
Luttman Robert J.
Melissa Basford
Monk D.
Motenko A. K.
W. R. Cozens
Publication venue: The Haworth Press

Crossref

Demonstrating paths for unlocking the value of cloud genomics through cross cohort analysis

Author: Alexander G. Bick
Anjene Musick
Anthony A. Philippakis
Chris Lunt
Dan M. Roden
David Glazer
Henry Robert Condon
Joshua C. Denny
Kelsey Mayo
Margaret Sunitha Selvaraj
Mark Effingham
Melissa A. Basford
Naomi Allen
Nicole Deflaux
Pradeep Natarajan
Rory Collins
Sara Haidermota
Publication venue: Nature Portfolio
Publication date: 01/09/2023

Recently, large-scale genomic projects such as All of Us and the UK Biobank have introduced a new research paradigm where data are stored centrally in cloud-based Trusted Research Environments (TREs). To characterize the advantages and drawbacks of different TRE attributes in facilitating cross-cohort analysis, we conduct a Genome-Wide Association Study of standard lipid measures using two approaches: meta-analysis and pooled analysis. Comparison of full summary data from both approaches with an external study shows strong correlation of known loci with lipid levels (R² ~ 83–97%). Importantly, 90 variants meet the significance threshold only in the meta-analysis and 64 variants are significant only in pooled analysis, with approximately 20% of variants in each of those groups being most prevalent in non-European, non-Asian ancestry individuals. These findings have important implications, as technical and policy choices lead to cross-cohort analyses generating similar, but not identical results, particularly for non-European ancestral populations.

Directory of Open Access Journals

Facilitating pharmacogenetic studies using electronic health records and natural-language processing: a case study of warfarin

Author: Basford Melissa A
Bowton Erica A
Cowan James D
Crawford Dana C
Denny Joshua C
Jeff Janina M
Jiang Min
Masys Daniel R
Oetjens Matt
Pulley Jill M
Ramirez Andrea H
Ritchie Marylyn D
Roden Dan M
Wang Xiaoming
Xu Hua
Publication venue: BMJ Group

Crossref

PubMed Central

Assessment of a pharmacogenomic marker panel in a polypharmacy population identified from electronic medical records

Author: Dan M Roden
Dan R Masys
Dana C Crawford
Danielle M Richardson
Erica Bowton
Grady
Holli H Dilks
Jill M Pulley
Joshua C Denny
Marylyn D Ritchie
Matthew T Oetjens
Melissa A Basford
Nicole A Restrepo
Niloufar B Gillani
Russell A Wilke
Publication venue: Future Medicine Ltd

Crossref

Candidate region targeted for fine-mapping.

Author: Abel N. Kho
Dan M. Roden
Dana C. Crawford
Janina M. Jeff
Jennifer A. Pacheco
Joshua C. Denny
Loren L. Armstrong
M. Geoffrey Hayes
Marylyn D. Ritchie
Melissa A. Basford
Rex L. Chisholm
Rongling Li
Wendy A. Wolf

Using Seattle SNPs genome browser, the candidate genes located within 100 TCIRG1 gene, their orientation, and gene structure are displayed. SNPs annotated for these genes are located at the top of the figure denoted by hash marks. (Image generated from http://pga.gs.washington.edu/).

FigShare