61 research outputs found
Nonnegative mean squared prediction error estimation in small area estimation
Small area estimation has received enormous attention in recent years due to
its wide range of application, particularly in policy making decisions. The
variance of a direct small area estimator, which rests on the area's own sample size, is unduly large,
so there is a need to construct model-based estimators with low mean squared
prediction error (MSPE). Estimation of the MSPE, and in particular bias
correction of the MSPE estimator, is a central piece of small area estimation research.
In this article, a new technique of bias correction for the estimated MSPE is
proposed. It is shown that the new MSPE estimator attains the same level
of bias correction as the existing estimators based on straight Taylor
expansion and jackknife methods. However, unlike the existing methods, the
proposed estimate of MSPE is always nonnegative. Furthermore, the proposed
method can be used for general two-level small area models where the variables
at each level can be discrete or continuous and, in particular, be nonnormal.
Comment: 21 Page
A penalized empirical likelihood method in high dimensions
This paper formulates a penalized empirical likelihood (PEL) method for
inference on the population mean when the dimension of the observations may
grow faster than the sample size. Asymptotic distributions of the PEL ratio
statistic are derived under different component-wise dependence structures of
the observations, namely, (i) non-ergodic, (ii) long-range dependence and (iii)
short-range dependence. It follows that the limit distribution of the proposed
PEL ratio statistic can vary widely depending on the correlation structure, and
it is typically different from the usual chi-squared limit of the empirical
likelihood ratio statistic in the fixed and finite dimensional case. A unified
subsampling based calibration is proposed, and its validity is established in
all three cases, (i)-(iii). Finite sample properties of the method are
investigated through a simulation study.
Comment: Published at http://dx.doi.org/10.1214/12-AOS1040 in the Annals of
Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical
Statistics (http://www.imstat.org
Second Order Correctness of Perturbation Bootstrap M-Estimator of Multiple Linear Regression Parameter
Consider the multiple linear regression model $y_i = \mathbf{x}_i'\boldsymbol{\beta} + \epsilon_i$, where the $\epsilon_i$'s are independent and
identically distributed random variables, the $\mathbf{x}_i$'s are known design
vectors and $\boldsymbol{\beta}$ is the vector of parameters. An
effective way of approximating the distribution of the M-estimator
$\bar{\boldsymbol{\beta}}_n$, after proper centering and scaling, is the
Perturbation Bootstrap Method. In this current work, second order results of
this non-naive bootstrap method have been investigated. Second order
correctness is important for reducing the approximation error uniformly to
$o(n^{-1/2})$ to get better inferences. We show that the classical studentized
version of the bootstrapped estimator fails to be second order correct. We
introduce an innovative modification in the studentized version of the
bootstrapped statistic and show that the modified bootstrapped pivot is second
order correct (S.O.C.) for approximating the distribution of the studentized
M-estimator. Additionally, we show that the Perturbation Bootstrap continues to
be S.O.C. when the errors $\epsilon_i$'s are independent, but may not be
identically distributed. These findings establish the Perturbation Bootstrap
approximation as a significant improvement over asymptotic normality in the
regression M-estimation.
Comment: key words: M-Estimation, S.O.C., Perturbation Bootstrap, Edgeworth
Expansion, Studentization, Residual Bootstrap, Generalized Bootstrap, Wild
Bootstra
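
The perturbation idea in the abstract above can be illustrated with a minimal sketch. It assumes the simplest possible M-estimator, a no-intercept least-squares slope, and i.i.d. Exp(1) random weights; the function name, the weight distribution, and the defaults are illustrative assumptions, and the sketch does not implement the modified studentized pivot studied in the paper.

```python
import random


def perturbation_bootstrap_slopes(x, y, n_boot=200, seed=0):
    """Return bootstrap replicates of a no-intercept least-squares slope.

    Each replicate minimises sum_i G_i * (y_i - b * x_i)**2, where the G_i
    are i.i.d. Exp(1) weights (an illustrative choice, not the paper's
    specific weight conditions).
    """
    if len(x) != len(y) or not x:
        raise ValueError("x and y must be non-empty and of equal length")
    if all(xi == 0 for xi in x):
        raise ValueError("at least one design value must be nonzero")

    rng = random.Random(seed)
    slopes = []
    for _ in range(n_boot):
        # Draw one random weight per observation.
        g = [rng.expovariate(1.0) for _ in x]
        # Closed-form minimiser of the weighted least-squares criterion.
        num = sum(gi * xi * yi for gi, xi, yi in zip(g, x, y))
        den = sum(gi * xi * xi for gi, xi in zip(g, x))
        slopes.append(num / den)
    return slopes
```

Unlike a residual or pairs bootstrap, no observation is resampled here: the data stay fixed and only the weights in the estimating criterion are randomised.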
Convergence rates of empirical block length selectors for block bootstrap
We investigate the accuracy of two general non-parametric methods for
estimating optimal block lengths for block bootstraps with time series - the
first proposed in the seminal paper of Hall, Horowitz and Jing (Biometrika 82
(1995) 561-574) and the second from Lahiri et al. (Stat. Methodol. 4 (2007)
292-321). The relative performances of these general methods have been unknown
and, to provide a comparison, we focus on rates of convergence for these block
length selectors for the moving block bootstrap (MBB) with variance estimation
problems under the smooth function model. It is shown that, with suitable
choice of tuning parameters, the optimal convergence rate of the first method
is $O_p(n^{-1/6})$, where $n$ denotes the sample size. The optimal convergence
rate of the second method, with the same number of tuning parameters, is shown
to be $O_p(n^{-2/7})$, suggesting that the second method may generally have
better large-sample properties for block selection in block bootstrap
applications beyond variance estimation. We also compare the two general
methods with other plug-in methods specifically designed for block selection in
variance estimation, where the best possible convergence rate is shown to be
$O_p(n^{-1/3})$ and achieved by a method from Politis and White (Econometric
Rev. 23 (2004) 53-70).
Comment: Published at http://dx.doi.org/10.3150/13-BEJ511 in the Bernoulli
(http://isi.cbs.nl/bernoulli/) by the International Statistical
Institute/Bernoulli Society (http://isi.cbs.nl/BS/bshome.htm
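
For orientation, the quantity whose block length is being tuned can be sketched as follows. This is a minimal illustration of the moving block bootstrap (MBB) variance estimator for the sample mean, written in its closed form over overlapping block means; the function name is hypothetical, and the sketch does not implement any of the block length selectors compared in the abstract above.

```python
def mbb_variance(x, block_len):
    """MBB-type estimate of the variance of sqrt(n) * (sample mean).

    Uses all n - block_len + 1 overlapping blocks and returns
    block_len times the average squared deviation of the block means
    from their own average.
    """
    n = len(x)
    if not 1 <= block_len <= n:
        raise ValueError("block_len must lie between 1 and len(x)")

    n_blocks = n - block_len + 1
    # Mean of each overlapping block x[i], ..., x[i + block_len - 1].
    block_means = [
        sum(x[i:i + block_len]) / block_len for i in range(n_blocks)
    ]
    center = sum(block_means) / n_blocks
    spread = sum((m - center) ** 2 for m in block_means) / n_blocks
    return block_len * spread
```

With `block_len=1` this reduces to the usual (divide-by-n) sample variance, which ignores serial dependence; larger blocks capture more dependence at the cost of fewer effective blocks, which is the trade-off a block length selector has to resolve.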
On optimal spatial subsample size for variance estimation
We consider the problem of determining the optimal block (or subsample) size
for a spatial subsampling method for spatial processes observed on regular
grids. We derive expansions for the mean square error of the subsampling
variance estimator, which yield an expression for the theoretically optimal
block size. The optimal block size is shown to depend in an intricate way on
the geometry of the spatial sampling region as well as characteristics of the
underlying random field. Final expressions for the optimal block size make use
of some nontrivial estimates of lattice point counts in shifts of convex sets.
Optimal block sizes are computed for sampling regions of a number of commonly
encountered shapes. Numerical studies are performed to compare subsampling
methods as well as procedures for estimating the theoretically best block size.
Comment: Published at http://dx.doi.org/10.1214/009053604000000779 in the
Annals of Statistics (http://www.imstat.org/aos/) by the Institute of
Mathematical Statistics (http://www.imstat.org
A frequency domain empirical likelihood for short- and long-range dependence
This paper introduces a version of empirical likelihood based on the
periodogram and spectral estimating equations. This formulation handles
dependent data through a data transformation (i.e., a Fourier transform) and is
developed in terms of the spectral distribution rather than a time domain
probability distribution. The asymptotic properties of frequency domain
empirical likelihood are studied for linear time processes exhibiting both
short- and long-range dependence. The method results in likelihood ratios which
can be used to build nonparametric, asymptotically correct confidence regions
for a class of normalized (or ratio) spectral parameters, including
autocorrelations. Maximum empirical likelihood estimators are possible, as well
as tests of spectral moment conditions. The methodology can be applied to
several inference problems such as Whittle estimation and goodness-of-fit
testing.
Comment: Published at http://dx.doi.org/10.1214/009053606000000902 in the
Annals of Statistics (http://www.imstat.org/aos/) by the Institute of
Mathematical Statistics (http://www.imstat.org
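
The data transformation that the abstract above builds on, the periodogram, can be sketched in a few lines. This is only an illustration of that ingredient, assuming the normalisation $I_n(\lambda) = |\sum_t x_t e^{-\mathrm{i} t \lambda}|^2 / (2\pi n)$ at the Fourier frequencies $\lambda_j = 2\pi j/n$; the function name and the normalising constant are assumptions, and the empirical likelihood step itself is not shown.

```python
import cmath
import math


def periodogram(x):
    """Return [(lambda_j, I(lambda_j))] at Fourier frequencies 2*pi*j/n."""
    n = len(x)
    if n == 0:
        raise ValueError("x must be non-empty")

    out = []
    for j in range(n):
        lam = 2 * math.pi * j / n
        # Discrete Fourier transform of the series at frequency lam.
        dft = sum(xt * cmath.exp(-1j * lam * t) for t, xt in enumerate(x))
        out.append((lam, abs(dft) ** 2 / (2 * math.pi * n)))
    return out
```

The direct sum costs O(n^2) operations; a fast Fourier transform would be the usual choice for long series.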
A Sub-Gaussian Berry-Esseen Theorem for the Hypergeometric Distribution
In this paper, we derive a necessary and sufficient condition on the
parameters of the Hypergeometric distribution for weak convergence to a Normal
limit. We establish a Berry-Esseen theorem for the Hypergeometric distribution
solely under this necessary and sufficient condition. We further derive a
nonuniform Berry-Esseen bound where the tails of the difference between the
Hypergeometric and the Normal distribution functions are shown to decay at a
sub-Gaussian rate.
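
The distance that a Berry-Esseen theorem bounds can be computed exactly for given parameters. The sketch below is a numerical illustration only, assuming a population of size `N` with `M` marked items and a sample of size `n` (parameter names are this sketch's own); it evaluates the sup-distance between the Hypergeometric distribution function and the Normal distribution function with matching mean and variance, and does not reproduce the paper's bounds.

```python
import math


def hypergeom_pmf(k, N, M, n):
    """P(X = k) when drawing n items without replacement from N, M marked."""
    return math.comb(M, k) * math.comb(N - M, n - k) / math.comb(N, n)


def normal_cdf(z):
    """Standard Normal distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def kolmogorov_distance(N, M, n):
    """Sup-distance between the Hypergeometric CDF and its Normal approximation."""
    if not (0 < M < N and 0 < n < N):
        raise ValueError("need 0 < M < N and 0 < n < N")

    p = M / N
    mean = n * p
    sd = math.sqrt(n * p * (1 - p) * (N - n) / (N - 1))
    lo, hi = max(0, n - (N - M)), min(n, M)

    cdf, worst = 0.0, 0.0
    for k in range(lo, hi + 1):
        phi = normal_cdf((k - mean) / sd)
        worst = max(worst, abs(cdf - phi))  # just below the jump at k
        cdf += hypergeom_pmf(k, N, M, n)
        worst = max(worst, abs(cdf - phi))  # at the jump itself
    return worst
```

Because the Hypergeometric distribution function is a step function and the Normal one is monotone, checking both sides of every jump is enough to find the supremum.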
A frequency domain empirical likelihood method for irregularly spaced spatial data
This paper develops empirical likelihood methodology for irregularly spaced
spatial data in the frequency domain. Unlike the frequency domain empirical
likelihood (FDEL) methodology for time series (on a regular grid), the
formulation of the spatial FDEL needs special care due to lack of the usual
orthogonality properties of the discrete Fourier transform for irregularly
spaced data and due to presence of nontrivial bias in the periodogram under
different spatial asymptotic structures. A spatial FDEL is formulated in the
paper taking into account the effects of these factors. The main results of the
paper show that Wilks' phenomenon holds for a scaled version of the logarithm
of the proposed empirical likelihood ratio statistic in the sense that it is
asymptotically distribution-free and has a chi-squared limit. As a result, the
proposed spatial FDEL method can be used to build nonparametric, asymptotically
correct confidence regions and tests for covariance parameters that are defined
through spectral estimating equations, for irregularly spaced spatial data. In
comparison to the more common studentization approach, a major advantage of our
method is that it does not require explicit estimation of the standard error of
an estimator, which is itself a very difficult problem as the asymptotic
variances of many common estimators depend on intricate interactions among
several population quantities, including the spectral density of the spatial
process, the spatial sampling density and the spatial asymptotic structure.
Results from a numerical study are also reported to illustrate the methodology
and its finite sample properties.
Comment: Published at http://dx.doi.org/10.1214/14-AOS1291 in the Annals of
Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical
Statistics (http://www.imstat.org
Resampling Based Empirical Prediction: An Application to Small Area Estimation
Best linear unbiased prediction is well known for its wide range of
applications including small area estimation. While the theory is well
established for mixed linear models and under normality of the error and mixing
distributions, the literature is sparse for nonlinear mixed models under
nonnormality of the error or of the mixing distributions. This article develops
a resampling based unified approach for predicting mixed effects under a
generalized mixed model set up. Second order accurate nonnegative estimators of
mean squared prediction errors are also developed. Given the parametric model,
the proposed methodology automatically produces estimates of the small area
parameters and their MSPEs, without requiring explicit analytical expressions
for the MSPE.
Comment: 31 page
On Statistical Properties of A Veracity Scoring Method for Spatial Data
Measuring veracity or reliability of noisy data is of utmost importance,
especially in scenarios where the information is gathered through
automated systems. In a recent paper, Chakraborty et al. (2019)
introduced a veracity scoring technique for geostatistical data. The authors
used high-quality `reference' data to measure the veracity of the
varying-quality observations and incorporated the veracity scores in their
analysis of mobile-sensor generated noisy weather data to generate efficient
predictions of the ambient temperature process. In this paper, we consider the
scenario when no reference data is available and hence, the veracity scores
(referred to as VS) are defined based on `local' summaries of the observations. We
develop a VS-based estimation method for parameters of a spatial regression
model. Under a non-stationary noise structure and fairly general assumptions on
the underlying spatial process, we show that the VS-based estimators of the
regression parameters are consistent. Moreover, we establish the advantage of
the VS-based estimators as compared to the ordinary least squares (OLS)
estimator by analyzing their asymptotic mean squared errors. We illustrate the
merits of the VS-based technique through simulations and apply the methodology
to a real data set on mass percentages of ash in coal seams in Pennsylvania.
Comment: 37 pages, 4 figures, 6 tables, submitted to JRSS-