61 research outputs found

    Nonnegative mean squared prediction error estimation in small area estimation

    Small area estimation has received enormous attention in recent years due to its wide range of applications, particularly in policy-making decisions. The variance of the direct small area estimator, which is based on the area-specific sample size, is unduly large, and there is a need to construct model-based estimators with low mean squared prediction error (MSPE). Estimation of the MSPE, and in particular its bias correction, is a centerpiece of small area estimation research. In this article, a new technique of bias correction for the estimated MSPE is proposed. It is shown that the new MSPE estimator attains the same level of bias correction as the existing estimators based on straight Taylor expansion and jackknife methods. However, unlike the existing methods, the proposed estimate of MSPE is always nonnegative. Furthermore, the proposed method can be used for general two-level small area models where the variables at each level can be discrete or continuous and, in particular, be nonnormal. Comment: 21 pages

    A penalized empirical likelihood method in high dimensions

    This paper formulates a penalized empirical likelihood (PEL) method for inference on the population mean when the dimension of the observations may grow faster than the sample size. Asymptotic distributions of the PEL ratio statistic are derived under different component-wise dependence structures of the observations, namely, (i) non-ergodic, (ii) long-range dependence and (iii) short-range dependence. It follows that the limit distribution of the proposed PEL ratio statistic can vary widely depending on the correlation structure, and it is typically different from the usual chi-squared limit of the empirical likelihood ratio statistic in the fixed and finite dimensional case. A unified subsampling-based calibration is proposed, and its validity is established in all three cases, (i)-(iii). Finite sample properties of the method are investigated through a simulation study. Comment: Published at http://dx.doi.org/10.1214/12-AOS1040 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org)

    Second Order Correctness of Perturbation Bootstrap M-Estimator of Multiple Linear Regression Parameter

    Consider the multiple linear regression model $y_{i} = \boldsymbol{x}'_{i} \boldsymbol{\beta} + \epsilon_{i}$, where the $\epsilon_i$'s are independent and identically distributed random variables, the $\boldsymbol{x}_i$'s are known design vectors and $\boldsymbol{\beta}$ is the $p \times 1$ vector of parameters. An effective way of approximating the distribution of the M-estimator $\boldsymbol{\bar{\beta}}_n$, after proper centering and scaling, is the Perturbation Bootstrap Method. In this current work, second order results of this non-naive bootstrap method have been investigated. Second order correctness is important for reducing the approximation error uniformly to $o(n^{-1/2})$ to get better inferences. We show that the classical studentized version of the bootstrapped estimator fails to be second order correct. We introduce an innovative modification in the studentized version of the bootstrapped statistic and show that the modified bootstrapped pivot is second order correct (S.O.C.) for approximating the distribution of the studentized M-estimator. Additionally, we show that the Perturbation Bootstrap continues to be S.O.C. when the errors $\epsilon_i$'s are independent, but may not be identically distributed. These findings establish perturbation Bootstrap approximation as a significant improvement over asymptotic normality in the regression M-estimation. Comment: key words: M-Estimation, S.O.C., Perturbation Bootstrap, Edgeworth Expansion, Studentization, Residual Bootstrap, Generalized Bootstrap, Wild Bootstrap

    Convergence rates of empirical block length selectors for block bootstrap

    We investigate the accuracy of two general non-parametric methods for estimating optimal block lengths for block bootstraps with time series - the first proposed in the seminal paper of Hall, Horowitz and Jing (Biometrika 82 (1995) 561-574) and the second from Lahiri et al. (Stat. Methodol. 4 (2007) 292-321). The relative performances of these general methods have been unknown and, to provide a comparison, we focus on rates of convergence for these block length selectors for the moving block bootstrap (MBB) with variance estimation problems under the smooth function model. It is shown that, with suitable choice of tuning parameters, the optimal convergence rate of the first method is $O_p(n^{-1/6})$ where $n$ denotes the sample size. The optimal convergence rate of the second method, with the same number of tuning parameters, is shown to be $O_p(n^{-2/7})$, suggesting that the second method may generally have better large-sample properties for block selection in block bootstrap applications beyond variance estimation. We also compare the two general methods with other plug-in methods specifically designed for block selection in variance estimation, where the best possible convergence rate is shown to be $O_p(n^{-1/3})$ and achieved by a method from Politis and White (Econometric Rev. 23 (2004) 53-70). Comment: Published at http://dx.doi.org/10.3150/13-BEJ511 in the Bernoulli (http://isi.cbs.nl/bernoulli/) by the International Statistical Institute/Bernoulli Society (http://isi.cbs.nl/BS/bshome.htm)

    On optimal spatial subsample size for variance estimation

    We consider the problem of determining the optimal block (or subsample) size for a spatial subsampling method for spatial processes observed on regular grids. We derive expansions for the mean square error of the subsampling variance estimator, which yields an expression for the theoretically optimal block size. The optimal block size is shown to depend in an intricate way on the geometry of the spatial sampling region as well as characteristics of the underlying random field. Final expressions for the optimal block size make use of some nontrivial estimates of lattice point counts in shifts of convex sets. Optimal block sizes are computed for sampling regions of a number of commonly encountered shapes. Numerical studies are performed to compare subsampling methods as well as procedures for estimating the theoretically best block size. Comment: Published at http://dx.doi.org/10.1214/009053604000000779 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org)

    A frequency domain empirical likelihood for short- and long-range dependence

    This paper introduces a version of empirical likelihood based on the periodogram and spectral estimating equations. This formulation handles dependent data through a data transformation (i.e., a Fourier transform) and is developed in terms of the spectral distribution rather than a time domain probability distribution. The asymptotic properties of frequency domain empirical likelihood are studied for linear time processes exhibiting both short- and long-range dependence. The method results in likelihood ratios which can be used to build nonparametric, asymptotically correct confidence regions for a class of normalized (or ratio) spectral parameters, including autocorrelations. Maximum empirical likelihood estimators are possible, as well as tests of spectral moment conditions. The methodology can be applied to several inference problems such as Whittle estimation and goodness-of-fit testing. Comment: Published at http://dx.doi.org/10.1214/009053606000000902 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org)

    A Sub-Gaussian Berry-Esseen Theorem for the Hypergeometric Distribution

    In this paper, we derive a necessary and sufficient condition on the parameters of the Hypergeometric distribution for weak convergence to a Normal limit. We establish a Berry-Esseen theorem for the Hypergeometric distribution solely under this necessary and sufficient condition. We further derive a nonuniform Berry-Esseen bound where the tails of the difference between the Hypergeometric and the Normal distribution functions are shown to decay at a sub-Gaussian rate.

    A frequency domain empirical likelihood method for irregularly spaced spatial data

    This paper develops empirical likelihood methodology for irregularly spaced spatial data in the frequency domain. Unlike the frequency domain empirical likelihood (FDEL) methodology for time series (on a regular grid), the formulation of the spatial FDEL needs special care due to the lack of the usual orthogonality properties of the discrete Fourier transform for irregularly spaced data and due to the presence of nontrivial bias in the periodogram under different spatial asymptotic structures. A spatial FDEL is formulated in the paper taking into account the effects of these factors. The main results of the paper show that Wilks' phenomenon holds for a scaled version of the logarithm of the proposed empirical likelihood ratio statistic in the sense that it is asymptotically distribution-free and has a chi-squared limit. As a result, the proposed spatial FDEL method can be used to build nonparametric, asymptotically correct confidence regions and tests for covariance parameters that are defined through spectral estimating equations, for irregularly spaced spatial data. In comparison to the more common studentization approach, a major advantage of our method is that it does not require explicit estimation of the standard error of an estimator, which is itself a very difficult problem as the asymptotic variances of many common estimators depend on intricate interactions among several population quantities, including the spectral density of the spatial process, the spatial sampling density and the spatial asymptotic structure. Results from a numerical study are also reported to illustrate the methodology and its finite sample properties. Comment: Published at http://dx.doi.org/10.1214/14-AOS1291 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org)

    Resampling Based Empirical Prediction: An Application to Small Area Estimation

    Best linear unbiased prediction is well known for its wide range of applications, including small area estimation. While the theory is well established for mixed linear models and under normality of the error and mixing distributions, the literature is sparse for nonlinear mixed models under nonnormality of the error or of the mixing distributions. This article develops a resampling-based unified approach for predicting mixed effects under a generalized mixed model setup. Second order accurate nonnegative estimators of mean squared prediction errors (MSPEs) are also developed. Given the parametric model, the proposed methodology automatically produces estimates of the small area parameters and their MSPEs, without requiring explicit analytical expressions for the MSPE. Comment: 31 pages

    On Statistical Properties of A Veracity Scoring Method for Spatial Data

    Measuring the veracity or reliability of noisy data is of utmost importance, especially in scenarios where the information is gathered through automated systems. In a recent paper, Chakraborty et al. (2019) introduced a veracity scoring technique for geostatistical data. The authors used high-quality `reference' data to measure the veracity of the varying-quality observations and incorporated the veracity scores in their analysis of mobile-sensor generated noisy weather data to generate efficient predictions of the ambient temperature process. In this paper, we consider the scenario when no reference data is available and hence, the veracity scores (referred to as VS) are defined based on `local' summaries of the observations. We develop a VS-based estimation method for parameters of a spatial regression model. Under a non-stationary noise structure and fairly general assumptions on the underlying spatial process, we show that the VS-based estimators of the regression parameters are consistent. Moreover, we establish the advantage of the VS-based estimators as compared to the ordinary least squares (OLS) estimator by analyzing their asymptotic mean squared errors. We illustrate the merits of the VS-based technique through simulations and apply the methodology to a real data set on mass percentages of ash in coal seams in Pennsylvania. Comment: 37 pages, 4 figures, 6 tables, submitted to JRSS-