61 research outputs found

Nonnegative mean squared prediction error estimation in small area estimation

Authors: Lahiri Soumendra N.; Maiti Tapabrata
Publication date: 04/04/2006

Small area estimation has received enormous attention in recent years because of its wide range of applications, particularly in policy making. The variance of a small area estimator based on the direct sample size is unduly large, so there is a need to construct model-based estimators with low mean squared prediction error (MSPE). Estimation of the MSPE, and in particular bias correction of the estimated MSPE, is a central piece of small area estimation research. In this article, a new technique of bias correction for the estimated MSPE is proposed. It is shown that the new MSPE estimator attains the same level of bias correction as the existing estimators based on straight Taylor expansion and jackknife methods. However, unlike the existing methods, the proposed estimate of MSPE is always nonnegative. Furthermore, the proposed method can be used for general two-level small area models where the variables at each level can be discrete or continuous and, in particular, nonnormal.

Comment: 21 pages

arXiv.org e-Print Archive

A penalized empirical likelihood method in high dimensions

Authors: Lahiri Soumendra N.; Mukhopadhyay Subhodeep
Publication venue: Institute of Mathematical Statistics
Publication date: 27/02/2013

This paper formulates a penalized empirical likelihood (PEL) method for inference on the population mean when the dimension of the observations may grow faster than the sample size. Asymptotic distributions of the PEL ratio statistic are derived under different component-wise dependence structures of the observations, namely, (i) non-ergodic, (ii) long-range dependence and (iii) short-range dependence. It follows that the limit distribution of the proposed PEL ratio statistic can vary widely depending on the correlation structure, and it is typically different from the usual chi-squared limit of the empirical likelihood ratio statistic in the fixed and finite dimensional case. A unified subsampling based calibration is proposed, and its validity is established in all three cases, (i)-(iii). Finite sample properties of the method are investigated through a simulation study.

Comment: Published at http://dx.doi.org/10.1214/12-AOS1040 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org)

arXiv.org e-Print Archive

Second Order Correctness of Perturbation Bootstrap M-Estimator of Multiple Linear Regression Parameter

Authors: Das Debraj; Lahiri Soumendra Nath
Publication date: 17/12/2017

Consider the multiple linear regression model $y_{i} = \boldsymbol{x}'_{i} \boldsymbol{\beta} + \epsilon_{i}$, where the $\epsilon_i$'s are independent and identically distributed random variables, the $\boldsymbol{x}_i$'s are known design vectors and $\boldsymbol{\beta}$ is the $p \times 1$ vector of parameters. An effective way of approximating the distribution of the M-estimator $\boldsymbol{\bar{\beta}}_n$, after proper centering and scaling, is the perturbation bootstrap method. In this work, second order results for this non-naive bootstrap method are investigated. Second order correctness is important for reducing the approximation error uniformly to $o(n^{-1/2})$ and thereby obtaining better inferences. We show that the classical studentized version of the bootstrapped estimator fails to be second order correct. We introduce an innovative modification of the studentized version of the bootstrapped statistic and show that the modified bootstrapped pivot is second order correct (S.O.C.) for approximating the distribution of the studentized M-estimator. Additionally, we show that the perturbation bootstrap continues to be S.O.C. when the errors $\epsilon_i$ are independent but may not be identically distributed. These findings establish the perturbation bootstrap approximation as a significant improvement over asymptotic normality in regression M-estimation.

Comment: key words: M-Estimation, S.O.C., Perturbation Bootstrap, Edgeworth Expansion, Studentization, Residual Bootstrap, Generalized Bootstrap, Wild Bootstrap

arXiv.org e-Print Archive
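
To make the idea behind the perturbation bootstrap entry above concrete, here is a minimal sketch for the least squares special case of M-estimation. It is not the modified studentized pivot from the paper: the function name `perturbation_bootstrap_ols`, the choice of Exp(1) perturbation weights, and the simulated data are all assumptions made for illustration. Each bootstrap draw re-solves the regression with independent random nonnegative weights attached to the observations.

```python
import numpy as np

def perturbation_bootstrap_ols(X, y, n_boot=500, seed=0):
    """Return the OLS fit and perturbation-bootstrap draws of the coefficients.

    Each draw minimises sum_i G_i * (y_i - x_i' beta)^2, where the G_i are
    i.i.d. Exp(1) weights (mean 1, variance 1). The weight law is an assumed,
    illustrative choice.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n, p = X.shape
    rng = np.random.default_rng(seed)

    # Original (unweighted) least squares estimate.
    beta_hat = np.linalg.lstsq(X, y, rcond=None)[0]

    draws = np.empty((n_boot, p))
    for b in range(n_boot):
        g = rng.exponential(1.0, size=n)
        sw = np.sqrt(g)  # weighted LS == ordinary LS on rows scaled by sqrt(weight)
        draws[b] = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)[0]
    return beta_hat, draws

# Hypothetical data: intercept plus one covariate, true coefficients (1, 2).
rng = np.random.default_rng(42)
X = np.column_stack([np.ones(200), rng.normal(size=200)])
y = X @ np.array([1.0, 2.0]) + rng.normal(size=200)

beta_hat, draws = perturbation_bootstrap_ols(X, y)
# Spread of the draws around beta_hat approximates the sampling variability.
print(beta_hat, draws.std(axis=0))
```

The distribution of `draws - beta_hat` serves as the bootstrap approximation to the distribution of the centered estimator; studentizing it, and the correction needed for second order correctness, are beyond this sketch.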

Convergence rates of empirical block length selectors for block bootstrap

Authors: Lahiri Soumendra N.; Nordman Daniel J.
Publication venue: Bernoulli Society for Mathematical Statistics and Probability
Publication date: 13/03/2014

We investigate the accuracy of two general non-parametric methods for estimating optimal block lengths for block bootstraps with time series: the first proposed in the seminal paper of Hall, Horowitz and Jing (Biometrika 82 (1995) 561-574) and the second from Lahiri et al. (Stat. Methodol. 4 (2007) 292-321). The relative performances of these general methods have been unknown and, to provide a comparison, we focus on rates of convergence for these block length selectors for the moving block bootstrap (MBB) with variance estimation problems under the smooth function model. It is shown that, with suitable choice of tuning parameters, the optimal convergence rate of the first method is $O_p(n^{-1/6})$, where $n$ denotes the sample size. The optimal convergence rate of the second method, with the same number of tuning parameters, is shown to be $O_p(n^{-2/7})$, suggesting that the second method may generally have better large-sample properties for block selection in block bootstrap applications beyond variance estimation. We also compare the two general methods with other plug-in methods specifically designed for block selection in variance estimation, where the best possible convergence rate is shown to be $O_p(n^{-1/3})$ and achieved by a method from Politis and White (Econometric Rev. 23 (2004) 53-70).

Comment: Published at http://dx.doi.org/10.3150/13-BEJ511 in Bernoulli (http://isi.cbs.nl/bernoulli/) by the International Statistical Institute/Bernoulli Society (http://isi.cbs.nl/BS/bshome.htm)

arXiv.org e-Print Archive
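
As a rough illustration of the moving block bootstrap (MBB) variance estimation problem discussed in the entry above, the sketch below resamples overlapping blocks of a fixed length and estimates the variance of the scaled sample mean. It does not implement either block length selector; the function name `mbb_variance_of_mean`, the AR(1) test series, and the block length of 10 are assumptions chosen only for illustration.

```python
import numpy as np

def mbb_variance_of_mean(x, block_len, n_boot=500, seed=0):
    """MBB estimate of Var(sqrt(n) * sample mean) for a fixed block length."""
    x = np.asarray(x, dtype=float)
    n = x.size
    if not 1 <= block_len <= n:
        raise ValueError("block_len must be between 1 and len(x)")
    rng = np.random.default_rng(seed)

    n_blocks = n - block_len + 1         # number of overlapping blocks
    k = int(np.ceil(n / block_len))      # blocks needed to rebuild n points
    offsets = np.arange(block_len)

    boot_means = np.empty(n_boot)
    for b in range(n_boot):
        # Draw k block start positions with replacement, then concatenate
        # the blocks and trim the resampled series to length n.
        starts = rng.integers(0, n_blocks, size=k)
        idx = (starts[:, None] + offsets[None, :]).ravel()[:n]
        boot_means[b] = x[idx].mean()
    return n * boot_means.var()

# Hypothetical AR(1) series with coefficient 0.5.
rng = np.random.default_rng(1)
e = rng.normal(size=500)
x = np.empty(500)
x[0] = e[0]
for t in range(1, 500):
    x[t] = 0.5 * x[t - 1] + e[t]

print(mbb_variance_of_mean(x, block_len=10))
```

The estimate depends on `block_len`, which is why data-driven block length selectors of the kind compared in the entry are needed in practice.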

On optimal spatial subsample size for variance estimation

Authors: Lahiri Soumendra N.; Nordman Daniel J.
Publication venue: Institute of Mathematical Statistics
Publication date: 29/03/2005

We consider the problem of determining the optimal block (or subsample) size for a spatial subsampling method for spatial processes observed on regular grids. We derive expansions for the mean square error of the subsampling variance estimator, which yield an expression for the theoretically optimal block size. The optimal block size is shown to depend in an intricate way on the geometry of the spatial sampling region as well as characteristics of the underlying random field. Final expressions for the optimal block size make use of some nontrivial estimates of lattice point counts in shifts of convex sets. Optimal block sizes are computed for sampling regions of a number of commonly encountered shapes. Numerical studies are performed to compare subsampling methods as well as procedures for estimating the theoretically best block size.

Comment: Published at http://dx.doi.org/10.1214/009053604000000779 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org)

arXiv.org e-Print Archive

A frequency domain empirical likelihood for short- and long-range dependence

Authors: Lahiri Soumendra N.; Nordman Daniel J.
Publication venue: Institute of Mathematical Statistics
Publication date: 01/08/2007

This paper introduces a version of empirical likelihood based on the periodogram and spectral estimating equations. This formulation handles dependent data through a data transformation (i.e., a Fourier transform) and is developed in terms of the spectral distribution rather than a time domain probability distribution. The asymptotic properties of frequency domain empirical likelihood are studied for linear time processes exhibiting both short- and long-range dependence. The method results in likelihood ratios which can be used to build nonparametric, asymptotically correct confidence regions for a class of normalized (or ratio) spectral parameters, including autocorrelations. Maximum empirical likelihood estimators are possible, as well as tests of spectral moment conditions. The methodology can be applied to several inference problems such as Whittle estimation and goodness-of-fit testing.

Comment: Published at http://dx.doi.org/10.1214/009053606000000902 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org)

arXiv.org e-Print Archive

A Sub-Gaussian Berry-Esseen Theorem for the Hypergeometric Distribution

Authors: Chatterjee A.; Lahiri Soumendra N.; Maiti T.
Publication date: 01/01/2006

In this paper, we derive a necessary and sufficient condition on the parameters of the Hypergeometric distribution for weak convergence to a Normal limit. We establish a Berry-Esseen theorem for the Hypergeometric distribution solely under this necessary and sufficient condition. We further derive a nonuniform Berry-Esseen bound where the tails of the difference between the Hypergeometric and the Normal distribution functions are shown to decay at a sub-Gaussian rate.

arXiv.org e-Print Archive

CiteSeerX
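
The quantity that a Berry-Esseen theorem bounds, as in the Hypergeometric entry above, can be computed exactly for small cases. The sketch below evaluates the Kolmogorov distance between a Hypergeometric distribution function and the Normal distribution function with the same mean and variance. It does not reproduce the bound or the condition from the paper; the function name `hypergeom_normal_distance` and the parameter labels `N` (population size), `K` (number of marked items) and `n` (sample size) are assumptions made for illustration.

```python
from math import comb, erf, sqrt

def hypergeom_normal_distance(N, K, n):
    """Sup-distance between the Hypergeometric(N, K, n) CDF and its matching normal CDF."""
    if not (0 < K < N and 0 < n < N):
        raise ValueError("need 0 < K < N and 0 < n < N for a nondegenerate law")
    p = K / N
    mean = n * p
    sd = sqrt(n * p * (1 - p) * (N - n) / (N - 1))
    total = comb(N, n)
    lo, hi = max(0, n - (N - K)), min(n, K)   # support of the distribution

    cdf, dist = 0.0, 0.0
    for k in range(lo, hi + 1):
        phi = 0.5 * (1.0 + erf((k - mean) / (sd * sqrt(2.0))))
        # The step CDF jumps at k, so compare the normal CDF with both the
        # left limit F(k-) and the value F(k).
        dist = max(dist, abs(cdf - phi))
        cdf += comb(K, k) * comb(N - K, n - k) / total
        dist = max(dist, abs(cdf - phi))
    return dist

d_small = hypergeom_normal_distance(20, 10, 5)
d_large = hypergeom_normal_distance(1000, 500, 100)
print(d_small, d_large)
```

In this pair of examples the distance is smaller for the larger configuration, which is the qualitative behavior a Berry-Esseen bound quantifies.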

A frequency domain empirical likelihood method for irregularly spaced spatial data

Authors: Bandyopadhyay Soutir; Lahiri Soumendra N.; Nordman Daniel J.
Publication venue: Institute of Mathematical Statistics
Publication date: 17/03/2015

This paper develops empirical likelihood methodology for irregularly spaced spatial data in the frequency domain. Unlike the frequency domain empirical likelihood (FDEL) methodology for time series (on a regular grid), the formulation of the spatial FDEL needs special care due to the lack of the usual orthogonality properties of the discrete Fourier transform for irregularly spaced data and due to the presence of nontrivial bias in the periodogram under different spatial asymptotic structures. A spatial FDEL is formulated in the paper taking into account the effects of these factors. The main results of the paper show that Wilks' phenomenon holds for a scaled version of the logarithm of the proposed empirical likelihood ratio statistic in the sense that it is asymptotically distribution-free and has a chi-squared limit. As a result, the proposed spatial FDEL method can be used to build nonparametric, asymptotically correct confidence regions and tests for covariance parameters that are defined through spectral estimating equations, for irregularly spaced spatial data. In comparison to the more common studentization approach, a major advantage of our method is that it does not require explicit estimation of the standard error of an estimator, which is itself a very difficult problem as the asymptotic variances of many common estimators depend on intricate interactions among several population quantities, including the spectral density of the spatial process, the spatial sampling density and the spatial asymptotic structure. Results from a numerical study are also reported to illustrate the methodology and its finite sample properties.

Comment: Published at http://dx.doi.org/10.1214/14-AOS1291 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org)

arXiv.org e-Print Archive

Resampling Based Empirical Prediction: An Application to Small Area Estimation

Authors: Katzoff Myron; Lahiri Soumendra N.; Maiti Tapabrata; Parsons Van
Publication date: 01/01/2006

Best linear unbiased prediction is well known for its wide range of applications, including small area estimation. While the theory is well established for mixed linear models and under normality of the error and mixing distributions, the literature is sparse for nonlinear mixed models under nonnormality of the error or of the mixing distributions. This article develops a resampling based unified approach for predicting mixed effects under a generalized mixed model setup. Second order accurate nonnegative estimators of mean squared prediction errors are also developed. Given the parametric model, the proposed methodology automatically produces estimates of the small area parameters and their MSPEs, without requiring explicit analytical expressions for the MSPE.

Comment: 31 pages

arXiv.org e-Print Archive

CiteSeerX

On Statistical Properties of A Veracity Scoring Method for Spatial Data

Authors: Chakraborty Arnab; Lahiri Soumendra N.
Publication date: 20/06/2019

Measuring the veracity or reliability of noisy data is of utmost importance, especially in scenarios where the information is gathered through automated systems. In a recent paper, Chakraborty et al. (2019) introduced a veracity scoring technique for geostatistical data. The authors used high-quality `reference' data to measure the veracity of the varying-quality observations and incorporated the veracity scores in their analysis of mobile-sensor generated noisy weather data to generate efficient predictions of the ambient temperature process. In this paper, we consider the scenario when no reference data are available and hence the veracity scores (referred to as VS) are defined based on `local' summaries of the observations. We develop a VS-based estimation method for parameters of a spatial regression model. Under a non-stationary noise structure and fairly general assumptions on the underlying spatial process, we show that the VS-based estimators of the regression parameters are consistent. Moreover, we establish the advantage of the VS-based estimators over the ordinary least squares (OLS) estimator by analyzing their asymptotic mean squared errors. We illustrate the merits of the VS-based technique through simulations and apply the methodology to a real data set on mass percentages of ash in coal seams in Pennsylvania.

Comment: 37 pages, 4 figures, 6 tables, submitted to JRSS-

arXiv.org e-Print Archive