    Endogenous post-stratification in surveys: classifying with a sample-fitted model

    Post-stratification is frequently used to improve the precision of survey estimators when categorical auxiliary information is available from sources outside the survey. In natural resource surveys, such information is often obtained from remote sensing data, classified into categories and displayed as pixel-based maps. These maps may be constructed based on classification models fitted to the sample data. Post-stratification of the sample data based on categories derived from the sample data ("endogenous post-stratification") violates the standard post-stratification assumptions that observations are classified without error into post-strata and that post-stratum population counts are known. Properties of the endogenous post-stratification estimator are derived for the case of a sample-fitted generalized linear model, from which the post-strata are constructed by dividing the range of the model predictions into predetermined intervals. Design consistency of the endogenous post-stratification estimator is established under mild conditions. Under a superpopulation model, consistency and asymptotic normality of the endogenous post-stratification estimator are established, showing that it has the same asymptotic variance as the traditional post-stratified estimator with fixed strata. Simulation experiments demonstrate that the practical effect of first fitting a model to the survey data before post-stratifying is small, even for relatively small sample sizes. Comment: Published at http://dx.doi.org/10.1214/009053607000000703 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org).
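
    A minimal sketch of the endogenous post-stratification idea, in Python/NumPy under simple random sampling with a linear working model; the data, the model choice, and the quartile cut points are hypothetical illustrations, not the paper's setup.

    import numpy as np

    # Hypothetical population: x is auxiliary information known for every unit
    # (e.g., a remote-sensing covariate); y is observed only on the sample.
    rng = np.random.default_rng(0)
    N, n = 10_000, 200
    x_pop = rng.normal(size=N)
    y_pop = 2.0 + 1.5 * x_pop + rng.normal(size=N)
    s = rng.choice(N, size=n, replace=False)
    x_s, y_s = x_pop[s], y_pop[s]

    # Step 1: fit a working model to the sample only (here, simple linear regression).
    slope, intercept = np.polyfit(x_s, y_s, 1)
    pred_pop = intercept + slope * x_pop        # predictions for every population unit

    # Step 2: cut the range of the predictions into predetermined intervals
    # (here, quartiles of the predicted values) to define the post-strata.
    cuts = np.quantile(pred_pop, [0.25, 0.5, 0.75])
    g_pop = np.digitize(pred_pop, cuts)         # stratum label 0..3 for each unit
    g_s = g_pop[s]

    # Step 3: endogenous post-stratification estimator of the population mean:
    # weight each sample post-stratum mean by that stratum's population share.
    shares = np.bincount(g_pop, minlength=4) / N
    strat_means = np.array([y_s[g_s == g].mean() for g in range(4)])
    y_eps = np.sum(shares * strat_means)
    print(y_eps, y_pop.mean())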

    Rank-based estimation for all-pass time series models

    An autoregressive-moving average model in which all roots of the autoregressive polynomial are reciprocals of roots of the moving average polynomial, and vice versa, is called an all-pass time series model. All-pass models are useful for identifying and modeling noncausal and noninvertible autoregressive-moving average processes. We establish asymptotic normality and consistency for rank-based estimators of all-pass model parameters. The estimators are obtained by minimizing the rank-based residual dispersion function given by Jaeckel [Ann. Math. Statist. 43 (1972) 1449-1458]. These estimators can have the same asymptotic efficiency as maximum likelihood estimators and are robust. The behavior of the estimators for finite samples is studied via simulation, and rank estimation is used in the deconvolution of a simulated water gun seismogram. Comment: Published at http://dx.doi.org/10.1214/009053606000001316 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org).
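
    A minimal sketch (Python/NumPy, hypothetical parameter values) of the two ingredients named above: an all-pass(1) series, whose AR root 1/phi and MA root phi are reciprocals so that the spectrum is flat and the series is uncorrelated yet, with non-Gaussian noise, not independent; and Jaeckel's rank-based dispersion, the criterion the estimators minimize.

    import numpy as np

    rng = np.random.default_rng(1)
    phi, n = 0.6, 5000
    eps = rng.laplace(size=n + 1)                 # heavy-tailed (non-Gaussian) noise
    y = np.zeros(n + 1)
    for t in range(1, n + 1):
        # All-pass(1): (1 - phi B) y_t = (1 - B / phi) eps_t
        y[t] = phi * y[t - 1] + eps[t] - eps[t - 1] / phi
    y = y[1:]

    def acf(x, lag):
        x = x - x.mean()
        return np.dot(x[:-lag], x[lag:]) / np.dot(x, x)

    # Second-order white noise, but the squares are typically correlated,
    # which is what makes the noncausal/noninvertible structure identifiable.
    print([round(acf(y, k), 3) for k in (1, 2, 3)])
    print([round(acf(y ** 2, k), 3) for k in (1, 2, 3)])

    # Jaeckel's rank-based residual dispersion with Wilcoxon scores; the rank
    # estimators minimize this as a function of the all-pass parameters.
    def jaeckel_dispersion(resid):
        m = len(resid)
        ranks = np.argsort(np.argsort(resid)) + 1
        scores = np.sqrt(12.0) * (ranks / (m + 1.0) - 0.5)
        return float(np.sum(scores * resid))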

    The detection and estimation of long memory in stochastic volatility

    We propose a new time series representation of persistence in conditional variance called a long memory stochastic volatility (LMSV) model. The LMSV model is constructed by incorporating an ARFIMA process in a standard stochastic volatility scheme. Strongly consistent estimators of the parameters of the model are obtained by maximizing the spectral approximation to the Gaussian likelihood. The finite sample properties of the spectral likelihood estimator are analyzed by means of a Monte Carlo study. An empirical example with a long time series of stock prices demonstrates the superiority of the LMSV model over existing (short-memory) volatility models.
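
    A minimal sketch (Python/NumPy, illustrative parameter values) of the LMSV construction and of a Whittle-type spectral objective of the kind the abstract describes, written here as a negative log-likelihood to be minimized; the noise-floor term pi^2/2 is the variance of log chi-square(1), which applies when the return shocks are Gaussian.

    import numpy as np

    rng = np.random.default_rng(2)
    n, d, sigma = 4000, 0.35, 0.01

    # ARFIMA(0, d, 0) log-volatility via truncated MA(infinity) weights of (1 - B)^(-d):
    # psi_0 = 1, psi_j = psi_{j-1} * (j - 1 + d) / j.
    m = 2000
    psi = np.ones(m)
    for j in range(1, m):
        psi[j] = psi[j - 1] * (j - 1 + d) / j
    eta = rng.normal(size=n + m)
    v = np.convolve(eta, psi, mode="valid")[-n:]   # long-memory log-volatility

    xi = rng.normal(size=n)
    r = sigma * np.exp(v / 2.0) * xi               # LMSV returns
    x = np.log(r ** 2)                             # log-squared returns = const + v_t + log xi_t^2

    def whittle_neg_loglik(params, x):
        """Negative spectral (Whittle) approximation to the Gaussian likelihood of the
        log-squared returns for an LMSV model with ARFIMA(0, d, 0) log-volatility."""
        d, s2_eta = params
        n = len(x)
        lam = 2.0 * np.pi * np.arange(1, n // 2 + 1) / n
        periodogram = np.abs(np.fft.fft(x - x.mean())[1:n // 2 + 1]) ** 2 / (2.0 * np.pi * n)
        spec = (s2_eta / (2.0 * np.pi)) * np.abs(1.0 - np.exp(-1j * lam)) ** (-2.0 * d) \
               + (np.pi ** 2 / 2.0) / (2.0 * np.pi)   # long-memory part + log chi^2_1 noise floor
        return float(np.sum(np.log(spec) + periodogram / spec))

    # Minimizing this over (d, s2_eta), e.g. with scipy.optimize.minimize, gives a
    # spectral likelihood estimator of the kind discussed in the abstract.
    print(whittle_neg_loglik((d, 1.0), x))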

    Sampling Schemes for Policy Analyses Using Computer Simulation Experiments

    Evaluating the environmental and economic impacts of agricultural policies is not a simple task. A systematic approach to evaluation would include the effect of policy-dependent factors (such as tillage practices, crop rotations, and chemical use) as well as the effect of policy-independent covariates (such as weather, topography, and soil attributes) on response variables (such as the amount of soil eroded or chemical leached into the groundwater). For comparison purposes, the effects of these input combinations on the response variable would have to be assessed under competing policy scenarios. Because the number of input combinations is high in most problems, and because the policies to be evaluated are often not in use at the time of the study, practitioners have resorted to simulation experiments to generate data. But generating data from simulation models is often costly and time consuming; thus, the number of input combinations in a study may be limited even in simulation experiments. In this paper, we discuss the problem of designing computer simulation experiments that require generating data for just a fraction of the possible input combinations. We propose an approach based on subsampling the 1992 National Resources Inventory (NRI) points. We illustrate the procedure by assessing soil erosion in a situation where observed data (reported by the Natural Resources Conservation Service (NRCS)) are available for comparison. Estimates of soil erosion obtained using the proposed procedure are in good agreement with NRCS-reported values.
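
    A minimal sketch (Python with NumPy and pandas) of the subsampling idea: stratify a point file by policy-independent covariate classes, draw a within-class probability subsample for the expensive simulation runs, and rescale the expansion weights so population-level totals remain estimable. Column names and data are hypothetical, not the actual NRI layout.

    import numpy as np
    import pandas as pd

    rng = np.random.default_rng(3)
    points = pd.DataFrame({
        "soil_class": rng.choice(["A", "B", "C"], size=10_000),
        "slope_class": rng.choice(["flat", "moderate", "steep"], size=10_000),
        "weight": rng.uniform(50.0, 150.0, size=10_000),   # acreage expansion factor
    })

    # Run the simulation model on only 5% of the points, drawn within each
    # covariate class so all input combinations remain represented.
    frac = 0.05
    subsample = points.groupby(["soil_class", "slope_class"]).sample(frac=frac, random_state=0)
    subsample = subsample.assign(weight=subsample["weight"] / frac)

    # Each retained point would be run through the simulation model under every
    # policy scenario, and weighted totals of the simulated responses compared
    # across scenarios. The rescaled weights keep the expanded acreage of the
    # subsample approximately equal to that of the full point file.
    print(points["weight"].sum(), subsample["weight"].sum())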

    Optimal Information Acquisition Under a Geostatistical Model

    Studies examining the value of switching to a variable rate technology (VRT) fertilizer program assume producers possess perfect soil nitrate information. In reality, producers estimate soil nitrate levels through soil sampling. The value of switching to a VRT program depends on the quality of the estimates and on how the estimates are used. Larger sample sizes, increased spatial correlation, and decreased variability improve the estimates and increase returns. Fertilizing strictly to the estimated field map fails to account for estimation risk. Returns increase if the soil sample information is used in a Bayesian fashion to update soil nitrate beliefs at nonsampled sites.
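
    A minimal sketch (Python/NumPy, hypothetical locations and covariance parameters) of the Bayesian update described above: under a Gaussian geostatistical prior with an exponential covariance, noisy soil samples at a few sites update the nitrate beliefs at every nonsampled site through the conditional (simple-kriging) mean and variance.

    import numpy as np

    # Prior belief: mean nitrate level of 25 at every site on a transect;
    # spatial correlation decays exponentially with distance.
    sites = np.linspace(0.0, 1000.0, 51)            # locations (feet)
    prior_mean = np.full(sites.size, 25.0)
    sill, corr_range, nugget = 36.0, 300.0, 4.0     # variance, range, sampling noise
    dist = np.abs(sites[:, None] - sites[None, :])
    K = sill * np.exp(-dist / corr_range)

    # Noisy soil samples observed at a few sites.
    obs_idx = np.array([5, 20, 35, 48])
    obs = np.array([31.0, 22.0, 18.0, 27.0])

    # Gaussian conditioning: the posterior mean blends the prior with nearby
    # observations according to the spatial correlation, and the posterior
    # variance quantifies the remaining estimation risk at nonsampled sites.
    K_oo = K[np.ix_(obs_idx, obs_idx)] + nugget * np.eye(obs_idx.size)
    K_so = K[:, obs_idx]
    post_mean = prior_mean + K_so @ np.linalg.solve(K_oo, obs - prior_mean[obs_idx])
    post_var = np.diag(K) - np.sum(K_so * np.linalg.solve(K_oo, K_so.T).T, axis=1)

    # Basing fertilizer rates on post_mean alone treats the estimated map as truth;
    # carrying post_var into the decision is what accounts for estimation risk.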

    Single-Index Model-Assisted Estimation In Survey Sampling

    A model-assisted semiparametric method of estimating finite population totals is investigated with the aim of improving the precision of survey estimators by incorporating multivariate auxiliary information. The proposed superpopulation model is a single-index model, which has proven to be a simple and efficient semiparametric tool in multivariate regression. A class of estimators based on polynomial spline regression is proposed. These estimators are robust against deviations from the single-index model. Under standard design conditions, the proposed estimators are asymptotically design-unbiased, consistent, and asymptotically normal. An iterative optimization routine is provided that is fast enough for users to analyze large and complex survey data within seconds. The proposed method has been applied to simulated datasets and to the MU281 dataset, providing strong evidence that corroborates the asymptotic theory. Comment: 30 pages.
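
    A minimal sketch (Python/NumPy, under simple random sampling) of a spline-based, single-index model-assisted estimator of a population total. The index direction is taken from an ordinary least-squares fit purely for illustration, standing in for the paper's iterative optimization routine; the data, knots, and link function are hypothetical.

    import numpy as np

    rng = np.random.default_rng(5)
    N, n = 20_000, 400
    X = rng.normal(size=(N, 3))                          # auxiliary variables, known for all units
    beta_true = np.array([1.0, -0.5, 0.25])
    y = 10.0 + np.sin(X @ beta_true) + 0.3 * rng.normal(size=N)   # single-index population
    s = rng.choice(N, size=n, replace=False)
    pi = n / N                                            # SRS inclusion probability

    # Step 1: a working index direction (OLS slopes, normalized) from the sample.
    Xs, ys = X[s], y[s]
    beta_hat = np.linalg.lstsq(np.c_[np.ones(n), Xs], ys, rcond=None)[0][1:]
    beta_hat /= np.linalg.norm(beta_hat)

    # Step 2: polynomial (cubic truncated-power) spline regression of y on the index.
    def spline_basis(u, knots):
        cols = [np.ones_like(u), u, u ** 2, u ** 3]
        cols += [np.clip(u - k, 0.0, None) ** 3 for k in knots]
        return np.column_stack(cols)

    u_s = Xs @ beta_hat
    knots = np.quantile(u_s, [0.2, 0.4, 0.6, 0.8])
    coef = np.linalg.lstsq(spline_basis(u_s, knots), ys, rcond=None)[0]
    m_hat = spline_basis(X @ beta_hat, knots) @ coef      # fitted values for every unit

    # Step 3: model-assisted (difference) estimator of the population total:
    # population sum of fitted values plus the design-weighted sum of sample residuals.
    t_hat = m_hat.sum() + np.sum((ys - m_hat[s]) / pi)
    print(t_hat, y.sum())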