When do covariates matter? And which ones, and how much?

Abstract

Many authors add variables sequentially to their covariate sets when using linear estimators to investigate the effect of a variable of interest, X_1, on some outcome y. One justification for this practice involves robustness: if estimates of the coefficient on X_1 are stable across specifications, then researchers conclude that their findings are robust. A second justification involves accounting: by measuring the change in X_1's estimated coefficient as they add sets of covariates to the specification, researchers sometimes claim to have measured the effects of covariate variation on this coefficient. In this paper, I show that sequential covariate addition can be very misleading. The relationship between X_1 and a given covariate set may be sensitive to the order in which other covariates have been added. This sensitivity is especially problematic for accounting exercises, as I show using the canonical example of the black-white wage gap. The paper's main contribution is to show how to use the population and sample omitted variables bias formulas to define an economically and econometrically meaningful conditional decomposition that explains how much various covariates account for sensitivity in the estimated coefficient on X_1. I illustrate the conditional decomposition using NLSY data on the black-white wage gap, with interesting empirical results. I also briefly discuss several extensions, including: instrumental variables estimators; the fact that my decomposition nests the Oaxaca-Blinder decomposition; and using the properties of the omitted variables bias formula to construct a Hausman test for cross-specification differences in coefficient estimates under the null that X_1 and X_2 are uncorrelated. Finally, I provide asymptotic variance formulas in an appendix, as well as a link to Stata code that implements my estimators.
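The sample omitted variables bias identity behind this kind of decomposition can be illustrated with a minimal sketch. It is not the paper's Stata code. It assumes plain OLS with an intercept, simulated data, and a hypothetical helper name `ovb_decomposition`. The change in the coefficient on X_1 between the base and full specifications is split into one term per added covariate, each equal to the slope from an auxiliary regression of that covariate on X_1 times the covariate's coefficient in the full specification. Because every term is computed from the full specification, the result does not depend on the order in which covariates are listed.

```python
import numpy as np


def ols(X, y):
    """Least-squares coefficients; y may be a vector or a matrix of outcomes."""
    return np.linalg.lstsq(X, y, rcond=None)[0]


def ovb_decomposition(x1, X2, y):
    """Split (base - full) coefficient on x1 into per-covariate terms.

    Hypothetical helper for illustration only.
    x1: (n,) variable of interest; X2: (n, k) added covariates; y: (n,) outcome.
    """
    n = len(y)
    X1 = np.column_stack([np.ones(n), x1])   # base specification: intercept + x1
    X_full = np.column_stack([X1, X2])       # full specification adds X2

    b_base = ols(X1, y)[1]                   # coefficient on x1, base
    coef_full = ols(X_full, y)
    b_full = coef_full[1]                    # coefficient on x1, full
    beta2 = coef_full[2:]                    # coefficients on X2, full

    # Auxiliary regressions: each column of X2 on intercept + x1.
    gamma = ols(X1, X2)[1]                   # slopes on x1, shape (k,)

    contributions = gamma * beta2            # one term per covariate in X2
    return b_base, b_full, contributions


# Simulated data (assumption: purely illustrative, not the NLSY sample).
rng = np.random.default_rng(0)
n = 500
x1 = rng.normal(size=n)
X2 = np.column_stack([
    0.8 * x1 + rng.normal(size=n),           # covariate correlated with x1
    -0.3 * x1 + rng.normal(size=n),          # second correlated covariate
])
y = 1.0 + 0.5 * x1 + X2 @ np.array([1.0, -2.0]) + rng.normal(size=n)

b_base, b_full, contributions = ovb_decomposition(x1, X2, y)
print("base - full:", b_base - b_full)
print("per-covariate terms:", contributions, "sum:", contributions.sum())
```

In this sketch the per-covariate terms sum to the base-minus-full difference up to floating-point error, and reordering the columns of `X2` only reorders the terms.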
