A Small Area Procedure for Estimating Population Counts

Abstract

Many large-scale surveys are designed to achieve acceptable reliability for large domains. Direct estimators for more detailed levels of aggregation are often judged to be unreliable because of small sample sizes. Estimation for small domains, often defined by geographic and demographic characteristics, is known as small area estimation. A common approach to small area estimation is to derive predictors under a specified mixed model for the direct estimators. A procedure of this type is developed for small areas defined by the cells of a two-way table.

Construction of small domain estimators using the Canadian Labour Force Survey (LFS) motivates the proposed model and estimation procedures. The LFS is designed to produce estimates of employment characteristics for certain pre-specified geographic and demographic domains. Direct estimators for specific occupations in small provinces are not published because their estimated coefficients of variation are large. A preliminary study conducted in cooperation with Statistics Canada investigated estimation procedures for small areas defined by the cross-classification of occupations and provinces, using data from a previous Census as auxiliary information. For consistency with published estimates, predictors are desired that preserve the direct estimators of the margins of the two-way table. One method in the Statistics Canada study is based on a nonlinear mixed model for the direct estimators of the proportions. An initial predictor is defined to be a convex combination of the direct estimator and an estimator obtained by raking the Census totals to the direct estimators of the marginal totals. The estimators resulting from the raking operation are called the SPREE estimators and are expected to have smaller variances than the direct estimators. The weight assigned to the direct estimator depends on the relative magnitudes of an estimator of the variance of a random model component and an estimator of the sampling variance. The final predictors are defined by raking the initial predictors to the direct estimators of the marginal totals. Estimation of the mean squared error (MSE) of the predictors was not fully developed.

This dissertation addresses several issues raised by the procedure discussed above. First, the method above uses SPREE to estimate a fixed expected value. SPREE is unbiased if the Census interactions persist unchanged through time and is efficient if the direct estimators of the cell totals are realizations of independent Poisson random variables. A generalization of SPREE that is more efficient under a specified covariance structure is explored. A simulation study shows that predictors constructed under the specified covariance structure can have smaller MSEs than predictors calculated with the direct estimators of the variances. An estimator of the MSE of the initial convex combination of the direct estimator and the estimator of the fixed expected value is derived using Taylor linearizations. The LFS procedure uses a final raking operation to benchmark the predictors. A bootstrap procedure is investigated as a way to account for the effects of raking on the MSEs of the predictors. The procedures are applied to the Canadian Labour Force Survey, but the issues discussed are of general interest because they arise in many small area applications.
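
The three steps described for the Statistics Canada procedure (raking Census totals to the direct margins, forming a convex combination, and raking again) can be illustrated with a minimal Python sketch. Everything numeric here is hypothetical: the 3 x 2 tables, the sampling variances, and the model variance are made-up values, and the weight `model_var / (model_var + sampling_var)` is a generic shrinkage form assumed for illustration. The abstract states only that the weight depends on the relative sizes of the two variance estimators, and the study works with a nonlinear model for proportions rather than the totals used here.

```python
import numpy as np


def rake(table, row_targets, col_targets, tol=1e-10, max_iter=1000):
    """Iterative proportional fitting of a positive two-way table.

    Rows and columns are rescaled in turn until the row and column sums
    match the target margins (which must share the same grand total).
    """
    fitted = np.asarray(table, dtype=float).copy()
    for _ in range(max_iter):
        fitted *= (row_targets / fitted.sum(axis=1))[:, None]
        fitted *= (col_targets / fitted.sum(axis=0))[None, :]
        # Columns are exact after the second step; check the rows.
        if np.allclose(fitted.sum(axis=1), row_targets, rtol=tol, atol=0):
            break
    return fitted


# Hypothetical occupation-by-province tables (illustrative values only).
census = np.array([[120.0, 30.0], [80.0, 45.0], [200.0, 25.0]])
direct = np.array([[130.0, 22.0], [70.0, 60.0], [215.0, 18.0]])
sampling_var = np.array([[400.0, 90.0], [250.0, 160.0], [600.0, 70.0]])
model_var = 100.0  # assumed estimate of the random-component variance

# Margins of the direct estimators, which the predictors must preserve.
row_targets = direct.sum(axis=1)
col_targets = direct.sum(axis=0)

# Step 1: SPREE-style estimator, Census totals raked to the direct margins.
spree = rake(census, row_targets, col_targets)

# Step 2: initial predictor, a convex combination of direct and SPREE.
weight = model_var / (model_var + sampling_var)  # assumed weight form
initial = weight * direct + (1.0 - weight) * spree

# Step 3: final predictor, the initial predictor raked back to the margins.
final = rake(initial, row_targets, col_targets)
```

Because the weights differ from cell to cell, the initial predictors generally do not reproduce the direct margins, which is why the final raking step is needed to benchmark them.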
