Missing ordinal covariates with informative selection

Abstract

This paper considers parameter estimation in a model for a continuous response variable y when an important ordinal explanatory variable x is missing for a large proportion of the sample. Whether x is observed (sample selection) is correlated with the response variable and/or with the unobserved values that x takes when missing. We propose addressing this endogenous selection, or 'not missing at random' (NMAR), problem by jointly modelling the informative selection mechanism, the ordinal explanatory variable, and the response variable. The method is illustrated by re-examining the ethnic gap in school achievement at age 16 in England using linked data from the National Pupil Database (NPD), the Longitudinal Study of Young People in England (LSYPE), and the Census 2001.
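
To make the idea of modelling the three components together concrete, one generic way to write the likelihood contribution of a single unit is sketched here. This is an illustrative selection-model factorization, not necessarily the specification used in the paper. The symbols s_i (an indicator equal to 1 when x_i is observed), z_i (fully observed covariates), K (the number of ordinal categories), and θ (the full parameter vector) are notation assumed for this sketch.

```latex
% Illustrative sketch only: s_i, z_i, K and \theta are assumed notation.
% When x_i is observed (s_i = 1) the unit contributes the joint density of (y_i, x_i, s_i = 1).
% When x_i is missing (s_i = 0) the unobserved category is summed out.
L_i(\theta) =
  \Bigl[ f(y_i \mid x_i, z_i)\, P(x_i \mid z_i)\, P(s_i = 1 \mid y_i, x_i, z_i) \Bigr]^{s_i}
  \times
  \Bigl[ \sum_{k=1}^{K} f(y_i \mid x_i = k, z_i)\, P(x_i = k \mid z_i)\,
         P(s_i = 0 \mid y_i, x_i = k, z_i) \Bigr]^{1 - s_i}
```

Because the selection probability may depend on y_i and on the possibly unobserved x_i, the selection term does not factor out of the likelihood, and so it cannot be ignored as it could under 'missing at random'. A possible choice, again an assumption rather than the paper's stated model, would be a linear model for y, an ordered probit for x, and a binary probit for s.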
