Empirical best prediction under a nested error model with log transformation
In regression models involving economic variables such as income, log
transformation is typically taken to achieve approximate normality and stabilize
the variance. However, interest often lies in predicting individual values
or means of the variable in the original scale. Under a nested error model for
the log transformation of the target variable, we show that the usual approach
of back transforming the predicted values may introduce a substantial bias.
We obtain the optimal (or “best”) predictors of individual values of the original
variable and of small area means under that model. Empirical best predictors
are defined by estimating the unknown model parameters in the best
predictors. When estimation is desired for subpopulations with small sample
sizes (small areas), nested error models are widely used to “borrow strength”
from the other areas and obtain estimators with greater efficiency than direct
estimators based on the scarce area-specific data. We show that naive predictors
of small area means obtained by back-transformation under the mentioned
model may even underperform direct estimators. Moreover, assessing
the uncertainty of the considered predictor is not straightforward. Exact mean
squared errors of the best predictors and second-order approximations to the
mean squared errors of the empirical best predictors are derived. Estimators
of the mean squared errors that are second-order correct are also obtained.
Simulation studies and an example with Mexican data on living conditions
illustrate the procedures. Supported by the Spanish Grants SEJ-2007-64500 and MTM2012-37077-C02-01. Supported by the Spanish Grants MTM-2012-33740 and ECO-2011-25706.
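The back-transformation bias described in this abstract can be illustrated with a small simulation. This is a minimal sketch, not the authors' procedure: it assumes an intercept-only nested error model on the log scale with known parameters, and the function name `compare_predictors` and all parameter values are hypothetical. Under normality, the log of an out-of-sample value given the area's sample is normal with mean `cond_mean` and variance `cond_var`, so the conditional expectation on the original scale is `exp(cond_mean + cond_var / 2)`, whereas the naive predictor uses `exp(cond_mean)` alone.

```python
import numpy as np

def compare_predictors(n_areas=500, n_sample=5, n_out=200,
                       beta0=1.0, sigma_v=0.5, sigma_e=0.8, seed=0):
    """Compare naive back-transformed and best predictors of area means.

    Assumed model (illustrative only): log y_ij = beta0 + v_i + e_ij,
    with v_i ~ N(0, sigma_v^2) and e_ij ~ N(0, sigma_e^2), parameters known.
    """
    rng = np.random.default_rng(seed)
    v = rng.normal(0.0, sigma_v, size=n_areas)
    # Sampled and non-sampled units share the same area effect v_i.
    log_s = beta0 + v[:, None] + rng.normal(0.0, sigma_e, size=(n_areas, n_sample))
    log_r = beta0 + v[:, None] + rng.normal(0.0, sigma_e, size=(n_areas, n_out))

    # Shrinkage factor and conditional moments of log y for non-sampled units.
    gamma = sigma_v**2 / (sigma_v**2 + sigma_e**2 / n_sample)
    cond_mean = beta0 + gamma * (log_s.mean(axis=1) - beta0)
    cond_var = sigma_v**2 * (1.0 - gamma) + sigma_e**2

    naive = np.exp(cond_mean)                  # plain back-transformation
    best = np.exp(cond_mean + 0.5 * cond_var)  # lognormal mean correction
    target = np.exp(log_r).mean(axis=1)        # realized non-sampled area mean

    return {
        "naive_bias": float(np.mean(naive - target)),
        "best_bias": float(np.mean(best - target)),
    }

if __name__ == "__main__":
    print(compare_predictors())
```

In this setup the naive predictor systematically underpredicts the area means, while the corrected predictor is approximately unbiased on average. With estimated parameters, further bias and extra uncertainty arise, which is what the paper's mean squared error results address.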
Nonparametric estimation of mean-squared prediction error in nested-error regression models
Nested-error regression models are widely used for analyzing clustered data.
For example, they are often applied to two-stage sample surveys, and in biology
and econometrics. Prediction is usually the main goal of such analyses, and
mean-squared prediction error is the main way in which prediction performance
is measured. In this paper we suggest a new approach to estimating mean-squared
prediction error. We introduce a matched-moment, double-bootstrap algorithm,
enabling the notorious underestimation of the naive mean-squared error
estimator to be substantially reduced. Our approach does not require specific
assumptions about the distributions of errors. Additionally, it is simple and
easy to apply. This is achieved through using Monte Carlo simulation to
implicitly develop formulae which, in a more conventional approach, would be
derived laboriously by mathematical arguments. Comment: Published at http://dx.doi.org/10.1214/009053606000000579 in the
Annals of Statistics (http://www.imstat.org/aos/) by the Institute of
Mathematical Statistics (http://www.imstat.org
Supported in part by NSF Grant SES-03-18184.
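The role of the second bootstrap level can be summarized by the generic additive double-bootstrap bias correction below. This is a sketch under assumed notation, not a transcription of the paper's algorithm: the symbols are introduced here for illustration, and the matched-moment construction of the bootstrap samples is not reproduced.

```latex
% \widehat{\mathrm{MSPE}}: first-level bootstrap estimate of the
%   mean-squared prediction error, computed from the data.
% \widehat{\mathrm{MSPE}}^{*}: the same estimate recomputed on a
%   bootstrap resample (second level).
\[
  \widehat{\mathrm{MSPE}}_{\mathrm{bc}}
  \;=\; 2\,\widehat{\mathrm{MSPE}}
  \;-\; E^{*}\!\bigl[\widehat{\mathrm{MSPE}}^{*} \,\big|\, \text{data}\bigr]
\]
```

The inner expectation is approximated by Monte Carlo averaging over resamples, which is how simulation replaces the analytic bias expansions that a conventional derivation would require.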
Small area estimation and prediction problems: spatial models, Bayesian multiple comparisons and robust MSE estimation
We study and partially solve three distinct problems in small area estimation. The problems are loosely connected by a common theme of
prediction and (empirical) Bayesian models.
In the first part of the thesis we consider prediction in a survey small area context with spatially correlated errors. We introduce a
novel asymptotic framework in which the spatially correlated small areas form clusters, the number of such clusters and the number of
small areas in each cluster growing with sample size. Under such an asymptotic framework we show consistency and asymptotic normality of the parameter estimators. For empirical predictors based on model estimates, we show through simulation and a real data example that prediction improves over estimates that ignore spatial
error correlations.
The second part of the thesis involves using a hierarchical Bayes approach to solve the problem of multiple comparison in small area estimation. In the context of multiple comparison, a new class of moment matching priors is introduced. This class includes the well-known superharmonic prior due to Stein. Through data analysis and simulation we illustrate the use of our class of
priors.
In the third part of the thesis, for a special case of the nested error regression model, we derive a non-parametric second-order unbiased estimator of the mean squared error of the empirical best linear unbiased predictor. For the balanced case, the Prasad-Rao estimator is shown to be second-order unbiased when the small area effects are non-normal. Through simulation we show that the Prasad-Rao estimator is robust to departures from normality.
Mean squared error of empirical predictor
The term "empirical predictor" refers to a two-stage predictor of a linear
combination of fixed and random effects. In the first stage, a predictor is
obtained but it involves unknown parameters; thus, in the second stage, the
unknown parameters are replaced by their estimators. In this paper, we consider
mean squared errors (MSE) of empirical predictors under a general setup, where
ML or REML estimators are used for the second stage. We obtain second-order
approximation to the MSE as well as an estimator of the MSE correct to the same
order. The general results are applied to mixed linear models to obtain a
second-order approximation to the MSE of the empirical best linear unbiased
predictor (EBLUP) of a linear mixed effect and an estimator of the MSE of EBLUP
whose bias is correct to second order. The general mixed linear model includes
the mixed ANOVA model and the longitudinal model as special cases.
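As a reading aid, the phrase "correct to second order" can be spelled out as follows. The notation is assumed for illustration and does not come from the abstract: m is the number of clusters or areas, \tilde\theta is the best predictor with known parameters, \hat\theta is the empirical predictor, and mse(\hat\theta) is the MSE estimator. The first line is the standard decomposition, which holds under normality when the variance-component estimators are even and translation invariant.

```latex
\begin{align*}
  \mathrm{MSE}(\hat\theta)
    &= \mathrm{MSE}(\tilde\theta)
       + E\bigl[(\hat\theta - \tilde\theta)^{2}\bigr],
    && \text{(decomposition under normality)} \\
  \mathrm{MSE}(\hat\theta)
    &= \mathrm{MSE}_{\mathrm{approx}} + o(m^{-1}),
    && \text{(second-order approximation)} \\
  E\bigl[\mathrm{mse}(\hat\theta)\bigr]
    &= \mathrm{MSE}(\hat\theta) + o(m^{-1}).
    && \text{(second-order correct estimator)}
\end{align*}
```

The second term in the first line captures the extra variability from estimating the parameters, which a naive plug-in MSE estimator ignores.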
Small Area Shrinkage Estimation
The need for small area estimates is increasingly felt in both the public and
private sectors in order to formulate their strategic plans. It is now widely
recognized that direct small area survey estimates are highly unreliable owing
to large standard errors and coefficients of variation. The reason behind this
is that a survey is usually designed to achieve a specified level of accuracy
at a higher level of geography than that of small areas. Lack of additional
resources makes it almost imperative to use the same data to produce small area
estimates. For example, if a survey is designed to estimate per capita income
for a state, the same survey data need to be used to produce similar estimates
for counties, subcounties and census divisions within that state. Thus, by
necessity, small area estimation needs explicit, or at least implicit, use of
models to link these areas. Improved small area estimates are found by
"borrowing strength" from similar neighboring areas. Comment: Published at http://dx.doi.org/10.1214/11-STS374 in Statistical
Science (http://www.imstat.org/sts/) by the Institute of Mathematical
Statistics (http://www.imstat.org
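The idea of "borrowing strength" can be sketched with a simple composite estimator. This is a minimal illustration, not the paper's method: it assumes an area-level model with a common mean, known sampling variances, and a known between-area model variance, and the function name `shrinkage_estimates` and its arguments are hypothetical.

```python
import numpy as np

def shrinkage_estimates(direct, sampling_var, model_var):
    """Shrink direct area estimates toward a pooled (synthetic) estimate.

    Assumed model (illustrative only): direct_i = theta_i + e_i with
    Var(e_i) = sampling_var_i known, and theta_i = mu + v_i with
    Var(v_i) = model_var known.
    """
    direct = np.asarray(direct, dtype=float)
    sampling_var = np.asarray(sampling_var, dtype=float)

    # Weighted least squares estimate of the common mean mu.
    weights = 1.0 / (model_var + sampling_var)
    synthetic = np.sum(weights * direct) / np.sum(weights)

    # Areas with noisy direct estimates get a small gamma, so they
    # lean more heavily on the pooled estimate.
    gamma = model_var / (model_var + sampling_var)
    return gamma * direct + (1.0 - gamma) * synthetic

if __name__ == "__main__":
    print(shrinkage_estimates([10.0, 20.0, 30.0], [1.0, 4.0, 25.0], 4.0))
```

The weight `gamma` moves each estimate toward the pooled value in proportion to how unreliable its direct estimate is. In practice the model variance must itself be estimated, which is where the MSE estimation problems discussed in the other entries arise.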
New important developments in small area estimation
The purpose of this paper is to review and discuss some of the important new developments in small area estimation (SAE) methods. Rao (2003) wrote a very comprehensive book, which covers all the main developments in this topic until that time, and so the focus of this review is on new developments in the last 7 years. However, to make the review more self-contained, I also briefly review some of the older developments. The review covers both design-based and model-dependent methods, with emphasis on the prediction of the area target quantities and the assessment of the prediction error. The style of the paper is similar to that of my previous review on SAE published in 2002: it explains the new problems investigated and describes the proposed solutions, without dwelling on theoretical details, which can be found in the original articles. I hope that this paper will be useful both to researchers who would like to learn more about the research carried out in SAE and to practitioners who might be interested in applying the new methods.
Uncertainty under a multivariate nested-error regression model with logarithmic transformation
Assuming a multivariate linear regression model with one random factor, we consider the parameters defined as exponentials of mixed effects, i.e., linear combinations of fixed and random effects. Such parameters are of particular interest in prediction problems where the dependent variable is the logarithm of the variable that is the object of inference. We derive bias-corrected empirical predictors of such parameters. A second-order approximation for the mean crossed product error of the predictors of two of these parameters is obtained, and an estimator is derived from it. The mean squared error is obtained as a particular case.