    Optimal variance estimation without estimating the mean function

    We study the least squares estimator in the residual variance estimation context. We show that the mean squared differences of paired observations are asymptotically normally distributed. We further establish that, by regressing the mean squared differences of these paired observations on the squared distances between paired covariates via a simple least squares procedure, the resulting variance estimator is not only asymptotically normal and root-n consistent, but also reaches the optimal bound in terms of estimation variance. We also demonstrate the advantage of the least squares estimator in comparison with existing methods in terms of the second order asymptotic properties. Comment: Published at http://dx.doi.org/10.3150/12-BEJ432 in Bernoulli (http://isi.cbs.nl/bernoulli/) by the International Statistical Institute/Bernoulli Society (http://isi.cbs.nl/BS/bshome.htm
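The regression idea in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes an equally spaced design x_i = i/n, an unweighted least squares fit, and a hypothetical test signal (a sine curve with Gaussian noise). The helper names `lagged_half_msd` and `variance_by_difference_regression` are invented for this illustration. For each lag k, half the mean squared difference of observations k steps apart is regressed on the squared distance (k/n)^2, and the fitted intercept serves as the variance estimate.

```python
import math
import random


def lagged_half_msd(y, k):
    """Half the mean squared difference of observations k positions apart."""
    n = len(y)
    return sum((y[i + k] - y[i]) ** 2 for i in range(n - k)) / (2 * (n - k))


def variance_by_difference_regression(y, max_lag):
    """Regress lagged half mean squared differences on squared distances.

    Assumes an equally spaced design x_i = i/n (an assumption of this sketch).
    Returns the fitted intercept, which estimates the residual variance.
    """
    n = len(y)
    lags = range(1, max_lag + 1)
    d = [(k / n) ** 2 for k in lags]           # squared covariate distances
    s = [lagged_half_msd(y, k) for k in lags]  # half mean squared differences
    d_bar = sum(d) / len(d)
    s_bar = sum(s) / len(s)
    slope = (sum((di - d_bar) * (si - s_bar) for di, si in zip(d, s))
             / sum((di - d_bar) ** 2 for di in d))
    return s_bar - slope * d_bar               # intercept of the fitted line


# Hypothetical data: smooth mean function plus noise with variance 0.25.
rng = random.Random(0)
n, sigma = 2000, 0.5
y = [math.sin(2 * math.pi * i / n) + rng.gauss(0, sigma) for i in range(n)]
sigma2_hat = variance_by_difference_regression(y, max_lag=10)
print(sigma2_hat)
```

The mean function is never estimated here; its contribution to the squared differences is absorbed by the slope term of the regression.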

    Robust state estimation using mixed integer programming

    This letter describes a robust state estimator based on the solution of a mixed integer program. A tolerance range is associated with each measurement, and an estimate is chosen to maximize the number of estimated measurements that remain within tolerance (or, equivalently, to minimize the number of measurements out of tolerance). Some small-scale examples are given which suggest that this approach is robust in the presence of gross errors, is not susceptible to leverage points, and can solve some pathological cases that have previously caused problems for robust estimation algorithms.
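The objective described in this abstract can be sketched for the simplest possible case. The sketch below is an assumption-laden toy, not the letter's formulation: it takes a single scalar state measured directly, and it replaces the mixed integer program with plain enumeration of candidate estimates. The function name and the data are hypothetical. For a scalar state, each measurement is within tolerance exactly when the estimate lies in an interval around it, and some best estimate always sits at the left endpoint of one of these intervals.

```python
def max_in_tolerance_estimate(measurements, tolerances):
    """Pick the scalar estimate that keeps the most measurements in tolerance.

    Enumeration stands in for the mixed integer program in this sketch.
    """
    intervals = [(z - t, z + t) for z, t in zip(measurements, tolerances)]
    best_x, best_count = None, -1
    for candidate, _ in intervals:  # try each interval's left endpoint
        count = sum(lo <= candidate <= hi for lo, hi in intervals)
        if count > best_count:
            best_x, best_count = candidate, count
    return best_x, best_count


# Hypothetical measurements of one quantity; the last one is a gross error.
z = [10.1, 9.9, 10.0, 10.2, 55.0]
tol = [0.3] * len(z)
x_hat, n_ok = max_in_tolerance_estimate(z, tol)
mean_estimate = sum(z) / len(z)  # least squares estimate, pulled by the outlier
print(x_hat, n_ok, mean_estimate)
```

In this toy case the gross error simply falls out of tolerance and has no influence on the estimate, whereas the sample mean is dragged far from the four consistent measurements.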

    Count data models with variance of unknown form: an application to a hedonic model of worker absenteeism

    We examined an econometric model of counts of worker absences due to illness. The underlying theoretical model is of a sluggishly adjusting hedonic labor market. We compared results from three parametric estimators, nonlinear least squares plus Poisson and negative binomial pseudo maximum likelihood, to generalized least squares using nonparametric estimates of the conditional variance. Our data support the hedonic model of worker absenteeism. Semiparametric generalized least squares coefficients are similar in sign, magnitude, and statistical significance to their econometric analogs where the mean and variance of the errors were specified ex ante. Overdispersion tests reject the Poisson specification. Robustness checks confirm that in our data parameter estimates are sensitive to the regressor list but are not sensitive to econometric technique, including how we corrected for possible heteroskedasticity of unknown form.

    Least absolute deviation estimation of linear econometric models: A literature review

    Econometricians generally take for granted that the error terms in econometric models are generated by distributions having a finite variance. However, the existence of error distributions with infinite variance has been known since the time of Pareto. The work of many econometricians, notably Meyer & Glauber (1964), Fama (1965) and Mandelbrot (1967), on economic data series such as prices in financial and commodity markets confirms that infinite variance distributions exist abundantly. The distribution of firms by size, the behaviour of speculative prices and various other recent economic phenomena also display similar trends. Further, econometricians generally assume that the disturbance term, which reflects the influence of innumerably many factors not accounted for in the model, approaches normality according to the Central Limit Theorem. But Bartels (1977) is of the opinion that there are limit theorems, just as likely to be relevant when considering the sum of a number of components in a regression disturbance, that lead to non-normal stable distributions characterized by infinite variance. Thus, the possibility of the error term following a non-normal distribution exists. The Least Squares method of estimating the parameters of linear (regression) models performs well provided that the residuals (disturbances or errors) are well behaved (preferably normally or near-normally distributed and not infested with large outliers) and follow the Gauss-Markov assumptions. However, models whose disturbances are prominently non-normally distributed and contain sizeable outliers are poorly estimated by the Least Squares method. Intensive research has established that in such cases estimation by the Least Absolute Deviation (LAD) method performs well. This paper is an attempt to survey the literature on LAD estimation of single as well as multi-equation linear econometric models. Keywords: LAD estimator; least absolute deviation estimation; econometric model; minimum absolute deviation; robust; outliers; L1 estimator; review of literature
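The contrast between Least Squares and LAD under a large outlier can be shown with a small sketch. It is an illustration only, not a method from the surveyed literature: it assumes a single-regressor model with hypothetical data, and it finds the LAD line by enumerating lines through pairs of data points. This works because LAD fitting is a linear program, so some optimal line with two parameters passes through at least two observations. The function names are invented for this example.

```python
from itertools import combinations


def ols_line(x, y):
    """Ordinary least squares intercept and slope for a single regressor."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    slope = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
             / sum((xi - x_bar) ** 2 for xi in x))
    return y_bar - slope * x_bar, slope


def lad_line(x, y):
    """Least absolute deviation line, found by checking every pair of points."""
    best = None
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        if x1 == x2:
            continue  # a vertical line is not a valid regression line
        slope = (y2 - y1) / (x2 - x1)
        intercept = y1 - slope * x1
        loss = sum(abs(yi - intercept - slope * xi) for xi, yi in zip(x, y))
        if best is None or loss < best[0]:
            best = (loss, intercept, slope)
    return best[1], best[2]


# Hypothetical data on the line y = 1 + 2x, with one gross outlier at x = 9.
x = [float(i) for i in range(10)]
y = [1.0 + 2.0 * xi for xi in x]
y[9] = 100.0
a_ols, b_ols = ols_line(x, y)
a_lad, b_lad = lad_line(x, y)
print((a_ols, b_ols), (a_lad, b_lad))
```

Pair enumeration costs on the order of n^3 operations, so it only suits small samples; practical LAD fitting usually relies on linear programming or iteratively reweighted least squares.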

    Count data models with variance of unknown form: an application to a hedonic model of worker absenteeism

    We examine an econometric model of counts of worker absences due to illness in a sluggishly adjusting hedonic labor market. We compare three estimators that parameterize the conditional variance (least squares, Poisson, and negative binomial pseudo maximum likelihood) to generalized least squares (GLS) using nonparametric estimates of the conditional variance. Our data support the hedonic absenteeism model. Semiparametric GLS coefficients are similar in sign, magnitude, and statistical significance to coefficients where the mean and variance of the errors are specified ex ante. In our data, coefficient estimates are sensitive to the regressor list but not to the econometric technique, including correcting for possible heteroskedasticity of unknown form.

    Does the Gravity Model Suffer from Selection Bias?

    When analyzing bilateral trade flow data, zero trade flows are quite common and problematic when a gravity equation is estimated with a log-linear functional form. This has caused many researchers to either ignore the zero trade flows or to replace zero with a small positive number. Both of these actions bias the resulting parameter estimates of the gravity equation. In this study we correct for this misspecification by using the Heckman selection model to estimate the bilateral trade flows for 46 agrifood products, for the period 1990 to 2000, for 52 countries. In our sample, selection bias rarely affects the signs of variables but often has a substantial effect on the magnitude, statistical significance and economic interpretation of the marginal effects. Hence, treating zero trade flows properly is important from both a statistical and an economics perspective. Keywords: Gravity model, selection bias, Agrifood Trade, Heckman Selection Model, marginal effects, Agricultural and Food Policy, Demand and Price Analysis, International Relations/Trade

    Improved variable selection with Forward-Lasso adaptive shrinkage

    Recently, considerable interest has focused on variable selection methods in regression situations where the number of predictors, p, is large relative to the number of observations, n. Two commonly applied variable selection approaches are the Lasso, which computes highly shrunk regression coefficients, and Forward Selection, which uses no shrinkage. We propose a new approach, "Forward-Lasso Adaptive SHrinkage" (FLASH), which includes the Lasso and Forward Selection as special cases, and can be used in both the linear regression and the Generalized Linear Model domains. As with the Lasso and Forward Selection, FLASH iteratively adds one variable to the model in a hierarchical fashion but, unlike these methods, at each step adjusts the level of shrinkage so as to optimize the selection of the next variable. We first present FLASH in the linear regression setting and show that it can be fitted using a variant of the computationally efficient LARS algorithm. Then, we extend FLASH to the GLM domain and demonstrate, through numerous simulations and real world data sets, as well as some theoretical analysis, that FLASH generally outperforms many competing approaches. Comment: Published at http://dx.doi.org/10.1214/10-AOAS375 in the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org

    SPATIAL REGRESSION MODELS FOR YIELD MONITOR DATA: A CASE STUDY FROM ARGENTINA

    Precision agricultural technology promises to move crop production closer to a manufacturing paradigm, but analysis of yield monitor, sensor and other spatial data has proven difficult because correlation among neighboring observations often violates the assumptions of classical statistical analysis. When spatial structure is ignored, variance estimates tend to be inflated and significance levels of test statistics are reduced. The gap between data analysis and site-specific recommendations has been identified as one of the key constraints on widespread adoption of precision agriculture technology. This paper compares four approaches that explicitly incorporate spatial correlation into regression models: (1) a spatial econometric approach; (2) a polynomial trend regression approach; (3) a classical nearest neighbor analysis; and (4) a geostatistical approach. In the Argentine data studied, the spatial econometric approach, the geostatistical approach and the spatial trend analysis offered stronger statistical evidence of spatial heterogeneity of nitrogen response than the ordinary least squares or nearest neighbor analysis. All the spatial models led to the same economic conclusion, which is that variable rate nitrogen is potentially profitable. The spatial econometric analysis can be implemented on relatively small data sets that do not have enough observations for estimation of the semivariogram required by geostatistics. The spatial trend analysis can be implemented with ordinary least squares functions that are already available in some GIS software. In this study, the main benefit of using spatial regression analysis is increased confidence in the corn yield response estimates by management zone, and in conclusions about the profitability of precision agriculture technologies. Keywords: Crop Production/Industries

    Residual Risk Revisited

    The Capital Asset Pricing Model in conjunction with the usual market model assumptions implies that well-diversified portfolios should be mean variance efficient and, hence, betas computed with respect to such indices should completely explain expected returns on individual assets. In fact, there is now a large body of evidence indicating that the market proxies usually employed in empirical tests are not mean variance efficient. Moreover, there is considerable evidence suggesting that these rejections are in part a consequence of the presence of omitted risk factors which are associated with nonzero risk premia in the residuals from the single index market model. Consequently, the idiosyncratic variances from the one factor model should partially reflect exposure to these omitted sources of systematic risk and, hence, should help explain expected returns. There are two plausible explanations for the inability to obtain statistically reliable estimates of a linear residual risk effect in the previous literature: (1) nonlinearity of the residual risk effect and (2) the inadequacy of the statistical procedures employed to measure it. The results presented below indicate that the econometric methods employed previously are the culprits. Pronounced residual risk effects are found in the whole fifty-four year sample and in numerous five year subperiods as well when weighted least squares estimation is coupled with the appropriate corrections for sampling error in the betas and residual variances of individual security returns. In addition, the evidence suggests that it is important to take account of the nonnormality and heteroskedasticity of security returns when making the appropriate measurement error corrections in cross-sectional regressions. Finally, the results are sensitive to the specification of the model for expected returns.