2 research outputs found
Get real: A synthetic dataset illustrating clinical and genetic covariates
Poster presented at the BD2K All Hands Meeting 2016 (https://datascience.nih.gov/bd2k/AHM), describing BD2K-funded work on a script that generates realistic synthetic datasets for hands-on learning in BD2K workshops.
CVD Risk Prediction Synthetic Dataset
A realistic but synthetic dataset for teaching students to use clinical and genetic covariates to predict cardiovascular risk.
For the workshop materials, please go here: https://github.com/laderast/cvdNight1
Contents:
1) dataDictionary.pdf - PDF file describing all covariates in the synthetic dataset.
2) fullPatientData.csv - CSV file with multiple covariates.
3) genoData.csv - CSV file covering a subset of the patients in fullPatientData.csv, with additional SNP calls.
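Because genoData.csv covers only a subset of the patients in fullPatientData.csv, a first step in using the two files together is usually to join them on a shared patient identifier. The following is a minimal sketch using only the Python standard library. The column name `patientID` is a hypothetical placeholder, and `read_rows` and `join_on_id` are illustrative helpers, not part of the dataset or the workshop materials. The actual identifier column should be looked up in dataDictionary.pdf.

```python
import csv


def read_rows(path):
    """Read a CSV file into a list of dicts keyed by the header row."""
    with open(path, newline="") as handle:
        return list(csv.DictReader(handle))


def join_on_id(patients, genotypes, id_column):
    """Attach clinical covariates to each genotyped patient.

    Returns one merged row per genotype row whose identifier also
    appears in the patient rows (an inner join on id_column).
    """
    by_id = {row[id_column]: row for row in patients}
    joined = []
    for geno in genotypes:
        patient = by_id.get(geno[id_column])
        if patient is not None:
            # Clinical columns first, then the additional SNP columns.
            joined.append({**patient, **geno})
    return joined


# Example usage, assuming both files are in the working directory and
# share an identifier column named "patientID" (an assumption):
#
#   patients = read_rows("fullPatientData.csv")
#   genotypes = read_rows("genoData.csv")
#   merged = join_on_id(patients, genotypes, "patientID")
#   print(len(patients), len(genotypes), len(merged))
```

If the subset relationship described above holds, the merged list should have as many rows as genoData.csv. A smaller count would suggest that the identifier column was chosen incorrectly.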