Streamlining Study Design and Statistical Analysis for Quality Improvement and Research Reproducibility

Abstract

Research Overview: This abstract summarizes current and planned work to streamline the processes and methods involved in study design and statistical analysis, in order to ensure the quality of statistical methods and the reproducibility of research.

Objectives/Goals: Key factors causing irreproducibility of research include inappropriate study design methodologies and statistical analysis. In modern statistical practice, irreproducibility can arise from statistical issues (false discoveries, p-hacking, overuse/misuse of p-values, low power, poor experimental design) and computational issues (data, code, and software management). Addressing these requires understanding the processes and workflows practiced by an organization, and developing and using metrics to quantify reproducibility.

Methods/Study Population: Within the Foundation of Discovery - Population Health Research, Center for Clinical and Translational Science, University of Utah, we are undertaking a project to streamline study design and statistical analysis workflows and processes. As a first step, we met with key stakeholders to understand current practices by eliciting example statistical projects, and then developed process information models for different types of statistical needs using Lucidchart. We then reviewed these with the Foundation’s leadership and the Standards Committee to arrive at ideal workflows and models, and defined key measurement points (such as those around study design, analysis plan, final report, requirements for quality checks, and double coding) for assessing reproducibility. As next steps, we are using our findings to embed analytical and infrastructural approaches within the statisticians’ workflows. These will include data and code dissemination platforms such as Box, Bitbucket, and GitHub; documentation platforms such as Confluence; and workflow tracking platforms such as Jira. These tools will simplify and automate the capture of communications as a statistician works through a project. Data-intensive processes will use process-workflow management platforms such as Activiti, Pegasus, and Taverna.

Results/Anticipated Results: We anticipate that these strategies will enable sharing and publishing of study protocols, data, code, and results across the research spectrum, active collaboration with the research team, and automation of key steps, along with decision support.

Discussion/Significance of Impact: This analysis of statistical methods and processes, together with computational methods to automate them, will ensure the quality of statistical methods and the reproducibility of research.
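
The measurement points defined in the Methods lend themselves to a simple per-project reproducibility metric. The following is a minimal Python sketch assuming an unweighted pass/fail checkpoint model; the checkpoint names and the scoring formula are illustrative assumptions, since the abstract does not specify how the metric is computed.

    # Minimal sketch of scoring the key measurement points into a per-project
    # reproducibility metric. The checkpoint names and the unweighted pass/fail
    # scoring are illustrative assumptions, not the project's actual metric.
    from dataclasses import dataclass

    @dataclass
    class Checkpoint:
        name: str
        satisfied: bool

    def reproducibility_score(checkpoints: list) -> float:
        """Fraction of measurement points satisfied for one statistical project."""
        return sum(c.satisfied for c in checkpoints) / len(checkpoints)

    project = [
        Checkpoint("study design reviewed", True),
        Checkpoint("analysis plan documented", True),
        Checkpoint("quality checks passed", True),
        Checkpoint("double coding completed", False),
        Checkpoint("final report archived", True),
    ]

    print(f"Reproducibility score: {reproducibility_score(project):.0%}")  # prints 80%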
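
Similarly, the double-coding quality check could be automated by comparing the summary outputs of two independently coded analyses before the final report is released. The sketch below assumes each analysis exports its estimates as a simple name-to-value mapping; the estimate names, values, and tolerance are hypothetical, not project specifics.

    # Minimal sketch of an automated double-coding check: a second statistician
    # independently re-implements the analysis, and the two sets of summary
    # estimates are compared within a numeric tolerance before sign-off.
    import math

    def compare_results(primary: dict, replication: dict, tol: float = 1e-8) -> list:
        """Return discrepancies between two independently coded analyses."""
        discrepancies = []
        for key in sorted(set(primary) | set(replication)):
            if key not in primary or key not in replication:
                discrepancies.append(f"{key}: present in only one analysis")
            elif not math.isclose(primary[key], replication[key], abs_tol=tol):
                discrepancies.append(
                    f"{key}: primary={primary[key]} vs replication={replication[key]}"
                )
        return discrepancies

    primary = {"treatment_effect": 1.42, "p_value": 0.031, "n": 248}
    replication = {"treatment_effect": 1.42, "p_value": 0.033, "n": 248}

    for issue in compare_results(primary, replication):
        print("DISCREPANCY:", issue)  # any output blocks release of the final report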
