County crop yield data from USDA-NASS are used extensively in the literature as well as in practice. In many applications, yield data are adjusted for the first two moments and then assumed to be independent and identically distributed. For most major crop-region combinations, yield data exist from 1955 onwards and reflect significant innovations in both seed and farm management technologies. These innovations have likely changed the yield distribution, raising doubt about the identically distributed assumption. We consider the question of how much historical yield data should be used in empirical analyses. First, we use distributional tests to assess if and when the adjusted yield data result from different data-generating processes (DGPs). Second, we consider the application to crop insurance by using an out-of-sample rating exercise. Third, we estimate DGPs and then simulate from them to quantify the additional error. Overall, the results indicate that using yield data more than 30 years old can substantially increase estimation error. Given that discarding data is unappealing, we propose three methodologies that can re-incorporate the discarded data. Our results suggest gains in efficiency from using these methodologies. While our results are most applicable to the crop insurance literature, they also suggest proceeding with caution when using historical yield data in other applications.
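The first step described above (adjust yields for the first two moments, then test whether older and more recent adjusted yields share a distribution) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration, not the procedure used in the paper: a linear trend for the mean, a standard deviation proportional to the trend level, a two-sample Kolmogorov-Smirnov statistic with a permutation p-value, a 30-year split, and synthetic data. All function names are hypothetical.

```python
import random
from bisect import bisect_right


def detrend_and_scale(years, yields):
    """Adjust yields for mean and variance (illustrative assumptions only).

    Assumes a linear trend in the mean and a standard deviation proportional
    to the trend level, so the adjusted value is (yield - fitted) / fitted.
    """
    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(yields) / n
    sxx = sum((x - mean_x) ** 2 for x in years)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, yields))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    fitted = [intercept + slope * x for x in years]
    return [(y - f) / f for y, f in zip(yields, fitted)]


def ks_statistic(a, b):
    """Two-sample Kolmogorov-Smirnov statistic: max gap between empirical CDFs."""
    a_sorted, b_sorted = sorted(a), sorted(b)
    gap = 0.0
    for value in a_sorted + b_sorted:
        cdf_a = bisect_right(a_sorted, value) / len(a_sorted)
        cdf_b = bisect_right(b_sorted, value) / len(b_sorted)
        gap = max(gap, abs(cdf_a - cdf_b))
    return gap


def permutation_pvalue(a, b, n_perm=2000, seed=0):
    """Permutation p-value for the KS statistic under exchangeability."""
    rng = random.Random(seed)
    observed = ks_statistic(a, b)
    pooled = list(a) + list(b)
    exceed = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        if ks_statistic(pooled[:len(a)], pooled[len(a):]) >= observed:
            exceed += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    return (exceed + 1) / (n_perm + 1)


if __name__ == "__main__":
    # Synthetic county yields, 1955-2020 (not real USDA-NASS data).
    rng = random.Random(42)
    years = list(range(1955, 2021))
    yields = [(40 + 1.5 * (t - 1955)) * (1 + rng.gauss(0, 0.1)) for t in years]

    adjusted = detrend_and_scale(years, yields)
    older = adjusted[:-30]    # observations more than 30 years old
    recent = adjusted[-30:]   # most recent 30 years
    print("KS statistic:", ks_statistic(older, recent))
    print("permutation p-value:", permutation_pvalue(older, recent))
```

A small p-value would suggest that the older and recent adjusted yields do not come from the same distribution under these assumptions. With real data, the choice of trend and variance adjustment would need to match the application.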
Acknowledgement