Imputation Strategies for Different Categories of Missing Data

Abstract

Handling missing data is crucial for the reliability and validity of study findings, yet it remains a significant challenge. This study investigates how missing data affect research outcomes and examines the underutilization of existing tools for managing missingness, which can leave gaps in critical information with tangible consequences for decision-making (Dziura et al.). Focusing on the three categories of missing data, namely Missing Completely At Random (MCAR), Missing At Random (MAR), and Missing Not At Random (MNAR), this research examines imputation strategies tailored to each category. Specifically, we compare the efficacy of several model-based imputation methods, including K-Nearest Neighbors, Maximum Likelihood Estimation, and Stepwise Regression, in predicting missing values. Through this comparison, the study aims to identify the most effective imputation approach for addressing missing data, thereby improving the robustness and reliability of research findings in both academic and practical contexts.
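
As a rough illustration of the three categories and of one of the methods named above, the following minimal Python sketch simulates each missingness mechanism on a toy dataset and then fills the gaps with a simple K-Nearest Neighbors rule. It is not the study's implementation: the column names (age, income), the functions make_missing and knn_impute, the missingness rates, and the single-predictor distance are all assumptions made for illustration.

```python
import random


def make_missing(rows, mechanism, rate=0.3, seed=0):
    """Return a copy of rows with some 'income' values replaced by None.

    rows is a list of dicts with numeric 'age' and 'income' keys
    (hypothetical columns, chosen only for this sketch).
    """
    rng = random.Random(seed)
    ages = sorted(r["age"] for r in rows)
    incomes = sorted(r["income"] for r in rows)
    age_cut = ages[len(ages) // 2]
    income_cut = incomes[len(incomes) // 2]

    out = []
    for r in rows:
        if mechanism == "MCAR":
            # Same probability for every row, unrelated to any variable.
            p = rate
        elif mechanism == "MAR":
            # Depends only on an observed variable (age).
            p = 2 * rate if r["age"] >= age_cut else 0.0
        elif mechanism == "MNAR":
            # Depends on the value that goes missing (income itself).
            p = 2 * rate if r["income"] >= income_cut else 0.0
        else:
            raise ValueError(f"unknown mechanism: {mechanism}")
        new = dict(r)
        if rng.random() < p:
            new["income"] = None
        out.append(new)
    return out


def knn_impute(rows, k=3):
    """Fill missing 'income' with the mean income of the k rows nearest in age."""
    observed = [r for r in rows if r["income"] is not None]
    if not observed:
        raise ValueError("no observed incomes to impute from")
    out = []
    for r in rows:
        new = dict(r)
        if new["income"] is None:
            nearest = sorted(observed, key=lambda o: abs(o["age"] - r["age"]))[:k]
            new["income"] = sum(o["income"] for o in nearest) / len(nearest)
        out.append(new)
    return out
```

In this sketch, a call such as knn_impute(make_missing(rows, "MAR")) would mask some incomes according to age and then estimate them from the rows closest in age. Under the MNAR mechanism the observed incomes are themselves a biased sample, so a neighbor-based estimate of this kind would be expected to inherit that bias, which is one reason the categories are usually treated separately.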

This paper was published in UNH Scholars' Repository.
