When Do Covariates Matter? And Which Ones, and How Much?

Alex Whalley; Bill Evans; David Card; Ingmar Prucha; Jim Smith; Jon Klick; Jonah B Gelbach; Judy Hellerstein; Justin; Justin McCrary; Kei Hirano; Mark Duggan; Steve Haider

Publication date: 1 January 2009

Abstract

Many authors add variables sequentially to their covariate sets when using linear estimators to investigate the effect of a variable of interest, X1, on some outcome y. One justification for this practice involves robustness: if estimates of the coefficient on X1 are stable across specifications, then researchers conclude that their findings are robust. A second justification involves accounting: by measuring the difference in X1's estimated coefficient as they add sets of covariates to the specification, researchers sometimes claim to have measured the effects of covariate variation on this coefficient. In this paper, I show that sequential covariate addition can be very misleading. The relationship between X1 and a given covariate set may be sensitive to the order in which other covariates have been added. This sensitivity is especially problematic for accounting exercises, as I show using the canonical example of the black-white wage gap. The paper's main contribution is to show how to use the population and sample omitted variables bias formulas to define an economically and econometrically meaningful conditional decomposition that explains how much various covariates account for sensitivity in the estimated coefficient on X1. I illustrate the conditional decomposition using NLSY data on the black-white wage gap, with interesting empirical results. I also briefly discuss several extensions, including: instrumental variables estimators; the fact that my decomposition nests the Oaxaca-Blinder decomposition; and using the properties of the omitted variables bias formula to construct a Hausman test for cross-specification differences in coefficient estimates under the null that X1 and X2 are uncorrelated. Finally, I provide asymptotic variance formulas in an appendix, as well as a link to Stata code that implements my estimators.
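The contrast drawn in the abstract can be illustrated with a minimal Python sketch. It is not the Stata code the abstract mentions, and it uses simulated data rather than the NLSY sample. It assumes the conditional decomposition takes the form implied by the sample omitted variables bias formula: the gap between the base and full estimates of the X1 coefficients equals (X1'X1)^(-1) X1'X2 times the full-model coefficients on X2, so each covariate's share is its column of that product. The function names (`simulate`, `ols`, `conditional_decomposition`, `sequential_changes`) and the data-generating process are hypothetical choices made for illustration.

```python
import numpy as np

def ols(X, y):
    """Least-squares coefficients; y may be a vector or a matrix of outcomes."""
    return np.linalg.lstsq(X, y, rcond=None)[0]

def simulate(n=5000, seed=0):
    """Hypothetical data: two covariates correlated with x1 and with each other."""
    rng = np.random.default_rng(seed)
    x1 = rng.normal(size=n)
    z1 = 0.8 * x1 + rng.normal(size=n)
    z2 = 0.5 * x1 + 0.7 * z1 + rng.normal(size=n)
    y = 1.0 + 0.5 * x1 + z1 + z2 + rng.normal(size=n)
    return x1, z1, z2, y

def conditional_decomposition(X1, X2, y):
    """Split (base - full) X1 coefficients into one column per covariate in X2."""
    k1 = X1.shape[1]
    b_base = ols(X1, y)                          # y on X1 only
    b_all = ols(np.column_stack([X1, X2]), y)    # y on X1 and X2
    b_full, b2 = b_all[:k1], b_all[k1:]
    gamma = ols(X1, X2)                          # each covariate regressed on X1
    contrib = gamma * b2                         # column k: share of covariate k
    return b_base, b_full, contrib

def sequential_changes(X1, first, second, y):
    """Changes in X1 coefficients when `first`, then `second`, is added."""
    k1 = X1.shape[1]
    b0 = ols(X1, y)
    b1 = ols(np.column_stack([X1, first]), y)[:k1]
    b2 = ols(np.column_stack([X1, first, second]), y)[:k1]
    return b0 - b1, b1 - b2

if __name__ == "__main__":
    x1, z1, z2, y = simulate()
    X1 = np.column_stack([np.ones_like(x1), x1])  # intercept plus variable of interest
    X2 = np.column_stack([z1, z2])
    b_base, b_full, contrib = conditional_decomposition(X1, X2, y)
    print("total gap on x1:", b_base[1] - b_full[1])
    print("conditional shares (z1, z2):", contrib[1])
    print("z1 added first:", sequential_changes(X1, z1, z2, y)[0][1])
    print("z1 added second:", sequential_changes(X1, z2, z1, y)[1][1])
```

In this sketch the conditional shares sum exactly to the total gap in every sample, because the omitted variables bias identity holds algebraically for OLS. The change attributed to z1 by sequential addition, in contrast, depends on whether z1 enters before or after z2.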

Available Versions

CiteSeerX

oai:CiteSeerX.psu:10.1.1.1069....

Last time updated on 07/12/2020