29 research outputs found

    Microbiome preterm birth DREAM challenge: Crowdsourcing machine learning approaches to advance preterm birth research

    Every year, 11% of infants are born preterm with significant health consequences, with the vaginal microbiome a risk factor for preterm birth. We crowdsource models to predict (1) preterm birth (PTB; <37 weeks) or (2) early preterm birth (ePTB; <32 weeks) from 9 vaginal microbiome studies representing 3,578 samples from 1,268 pregnant individuals, aggregated from public raw data via phylogenetic harmonization. The predictive models are validated on two independent unpublished datasets representing 331 samples from 148 pregnant individuals. The top-performing models (among 148 and 121 submissions from 318 teams) achieve area under the receiver operator characteristic (AUROC) curve scores of 0.69 and 0.87 predicting PTB and ePTB, respectively. Alpha diversity, VALENCIA community state types, and composition are important features in the top-performing models, most of which are tree-based methods. This work is a model for translation of microbiome data into clinically relevant predictive models and to better understand preterm birth.

    Cross-Tissue Transcriptomic Analysis Leveraging Machine Learning Approaches Identifies New Biomarkers for Rheumatoid Arthritis

    There is an urgent need to identify biomarkers for diagnosis and disease activity monitoring in rheumatoid arthritis (RA). We leveraged publicly available microarray gene expression data in the NCBI GEO database for whole blood (N=1,885) and synovial (N=284) tissues from RA patients and healthy controls. We developed a robust machine learning feature selection pipeline with validation on five independent datasets culminating in 13 genes: TNFAIP6, S100A8, TNFSF10, DRAM1, LY96, QPCT, KYNU, ENTPD1, CLIC1, ATP6V0E1, HSP90AB1, NCL and CIRBP which define the RA score and demonstrate its clinical utility: the score tracks the disease activity DAS28 (p = 7e-9), distinguishes osteoarthritis (OA) from RA (OR 0.57, p = 8e-10) and polyJIA from healthy controls (OR 1.15, p = 2e-4) and monitors treatment effect in RA (p = 2e-4). Finally, the immunoblotting analysis of six proteins on an independent cohort confirmed two proteins, TNFAIP6/TSG6 and HSP90AB1/HSP90

    Large-scale placenta DNA methylation integrated analysis reveals fetal sex-specific differentially methylated CpG sites and regions.

    Although male-female differences in placental structure and function have been observed, little is understood about their molecular underpinnings. Here, we present a mega-analysis of 14 publicly available placenta DNA methylation (DNAm) microarray datasets to identify individual CpGs and regions associated with fetal sex. In the discovery dataset of placentas from full-term pregnancies (N = 532 samples), 5212 CpGs met genome-wide significance (p < 1E-8) and were enriched in pathways such as keratinization (FDR p-value = 7.37E-14), chemokine activity (FDR p-value = 1.56E-2), and eosinophil migration (FDR p-value = 1.83E-2). Nine differentially methylated regions were identified (fwerArea < 0.1) including a region in the promoter of ZNF300 that showed consistent differential DNAm in samples from earlier timepoints in pregnancy and appeared to be driven predominantly by effects in the trophoblast cell type. We describe the largest study of fetal sex differences in placenta DNAm performed to date, revealing genes and pathways characterizing sex-specific placenta function and health outcomes later in life.

    Similarities and differences in Alzheimer’s dementia comorbidities in racialized populations identified from electronic medical records

    Woldemariam et al. use electronic medical records to explore comorbidities in individuals with Alzheimer’s Disease stratified by four identified race and ethnicity categories. Whilst most comorbidities are similar, a few comorbidities, including respiratory diseases, are associated with Black- and Latine-identified individuals.