The Paradata Information Model

Abstract

Presentation at the North American Data Documentation Conference (NADDI) 2013

Paradata is data on study processes and the collection of study data. Here we describe the development of a Paradata Information Model (PIM) in support of the National Children's Study (NCS) of the Eunice Kennedy Shriver National Institute of Child Health and Human Development. We propose that paradata can be recorded with accompanying metadata informed by the General Longitudinal Business Process Model (GLBPM) developed by the Data Documentation Initiative (DDI) and the General Statistical Business Process Model (GSBPM). The PIM is constructed in a joint top-down and bottom-up approach, appropriating broad verbs from DDI, HL7, LS-DAM, and CDISC, while incorporating study-specific processes involved in collecting NCS operational data elements (ODEs). In longitudinal studies, the hope is that collecting paradata will enable future researchers to integrate disparate data sets collected by a variety of technologies, especially in rapidly evolving fields like genomics. Additionally, by giving PIM elements preconditions and postconditions, we can develop software agents that use paradata metadata, as well as other information, to assist humans in conducting biomedical research. This should ultimately facilitate more rapid collection and analysis of information and enable a broader subset of researchers to discover and extract relevant information from study data sets.

Institute for Policy & Social Research, University of Kansas; University of Kansas Libraries; Alfred P. Sloan Foundation; Data Documentation Initiative Alliance, Booz Allen Hamilto
