Data pre-processing for the preterm prediction study MFMU dataset

Diab, Hatim; Radeva, Axinia; Raja, Anita; Rajan, Ashwath; Salleb-Aouissi, Ansaf; Tomar, Ashish; Vovsha, Ilia

Publication date: 1 January 2013
Publisher: Columbia University Libraries/Information Services

Abstract

Preterm birth (PTB) is a major public health problem with profound implications for society. Identifying women at risk of preterm birth during the course of their pregnancy would be extremely valuable. Previous research has largely focused on individual risk factors correlated with preterm birth (e.g. prior preterm birth, race, and infection) and less on combining these factors in a way that helps explain the complex etiologies of preterm birth. We attempt to address this gap by conducting a deeper analysis of the preterm prediction study data collected by the NICHD Maternal Fetal Medicine Units (MFMU) Network, a high-quality dataset covering over 3,000 singleton pregnancies, with detailed study visits and biospecimen collection at 24, 26, 28 and 30 weeks gestation. Reports from this dataset used relatively straightforward biostatistical methodologies, such as relative risk assessments, to measure associations between risk factors and PTB (Maternal Fetal Medicine Units Network. Biostatistical Coordinating Center NICHD Networks, 1995). These methods include descriptive statistics, Pearson correlation, Fisher’s exact tests and linear/logistic regression, in which risk factors are studied independently of each other. In order to perform detailed experiments on this data using non-linear Support Vector Machines and other machine learning (ML) methodologies, it is necessary to complete several pre-processing steps, which we describe in this report.
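
As an illustration of the relative risk assessments mentioned in the abstract, the following is a minimal Python sketch, not code from the report. The function name `relative_risk` and the counts are hypothetical and are not taken from the MFMU data. It computes the risk of an outcome among pregnancies with a given risk factor divided by the risk among pregnancies without it, which shows how such a measure considers one factor at a time.

```python
def relative_risk(exposed_cases, exposed_noncases,
                  unexposed_cases, unexposed_noncases):
    """Risk of the outcome among the exposed divided by risk among the unexposed."""
    exposed_total = exposed_cases + exposed_noncases
    unexposed_total = unexposed_cases + unexposed_noncases
    if exposed_total == 0 or unexposed_total == 0:
        raise ValueError("each group needs at least one pregnancy")
    risk_exposed = exposed_cases / exposed_total
    risk_unexposed = unexposed_cases / unexposed_total
    if risk_unexposed == 0:
        raise ValueError("relative risk is undefined when the unexposed risk is zero")
    return risk_exposed / risk_unexposed


# Made-up counts for illustration only; they are NOT taken from the MFMU data.
rr = relative_risk(exposed_cases=30, exposed_noncases=70,
                   unexposed_cases=10, unexposed_noncases=90)
print(f"relative risk = {rr:.2f}")
```

A value above 1 suggests the outcome is more frequent in the group with the risk factor. Because each 2x2 table involves a single factor, a measure of this kind cannot by itself capture interactions between factors.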
