
    A simple variance estimator for unequal probability sampling without replacement

    Survey sampling textbooks often refer to the Sen-Yates-Grundy variance estimator for use with without-replacement unequal probability designs. This estimator is rarely implemented, because of the complexity of determining joint inclusion probabilities. In practice, the variance is usually estimated by simpler estimators, such as the Hansen-Hurwitz with-replacement variance estimator, which often overestimates the variance for the large sampling fractions common in business surveys. We consider an alternative: the Hájek (1964) variance estimator, which depends only on the first-order inclusion probabilities and is usually more accurate than the Hansen-Hurwitz estimator. We review this estimator and show its practical value. We propose an alternative expression that is as simple as the Hansen-Hurwitz estimator, and we show how the Hájek estimator can be easily implemented with standard statistical packages.
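    A minimal sketch of one common form of the Hájek-type variance estimator for the Horvitz-Thompson total is given below; it uses only first-order inclusion probabilities. The exact expression proposed in the paper may differ, and the sample values are illustrative.

```python
import numpy as np

def hajek_variance(y, pi):
    """Hajek-type variance estimator for the Horvitz-Thompson total
    under a fixed-size, without-replacement unequal probability design.
    Only the first-order inclusion probabilities pi are needed."""
    y = np.asarray(y, dtype=float)
    pi = np.asarray(pi, dtype=float)
    n = y.size
    a = 1.0 - pi                               # one minus inclusion probability
    b_hat = np.sum(a * y / pi) / np.sum(a)     # weighted mean of the y_i / pi_i
    return n / (n - 1) * np.sum(a * (y / pi - b_hat) ** 2)

# Illustrative sample of n = 4 units: observed values and inclusion probabilities
y = [12.0, 7.5, 20.1, 9.3]
pi = [0.40, 0.25, 0.60, 0.35]
print(hajek_variance(y, pi))
```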

    Size constrained unequal probability sampling with a non-integer sum of inclusion probabilities

    More than 50 methods have been developed to draw unequal probability samples with fixed sample size. All these methods require the sum of the inclusion probabilities to be an integer. There are cases, however, where the sum of the desired inclusion probabilities is not an integer, and classical algorithms for drawing samples cannot be applied directly. We present two methods to overcome the problem of sample selection with unequal inclusion probabilities when their sum is not an integer and the sample size cannot be fixed. The first consists of splitting the inclusion probability vector. The second is based on extending the population with a phantom unit. For both methods the sample size is almost fixed, equal either to the integer part of the sum of the inclusion probabilities or to that integer plus one.
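    The sketch below illustrates the phantom-unit idea under simplifying assumptions: systematic unequal-probability sampling is used as the fixed-size algorithm (the paper does not prescribe this particular method), and the probability vector is made up.

```python
import numpy as np

def sample_with_phantom_unit(pi, rng=None):
    """Select a without-replacement sample when sum(pi) is not an integer:
    append a phantom unit whose inclusion probability tops the sum up to
    the next integer, draw a fixed-size sample, then drop the phantom if
    it was selected. Systematic unequal-probability sampling is used here
    as the fixed-size algorithm (an assumption of this sketch)."""
    rng = np.random.default_rng() if rng is None else rng
    pi = np.asarray(pi, dtype=float)
    n_star = int(np.ceil(pi.sum()))
    pi_ext = np.append(pi, n_star - pi.sum())   # phantom unit's probability
    # Systematic sampling of fixed size n_star from the extended population
    cum = np.cumsum(pi_ext)
    points = rng.uniform() + np.arange(n_star)
    selected = np.searchsorted(cum, points)
    # Discard the phantom (last index): realised size is n_star or n_star - 1
    return selected[selected < pi.size]

pi = [0.2, 0.5, 0.7, 0.4, 0.6]                  # probabilities summing to 2.4
print(sample_with_phantom_unit(pi))
```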

    On Estimation of Variance in Unequal Probability Sampling

    1 online resource (PDF, 38 pages)

    Inference under unequal probability sampling with the Bayesian exponentially tilted empirical likelihood.

    Fully Bayesian inference in the presence of unequal probability sampling requires stronger structural assumptions on the data-generating distribution than frequentist semiparametric methods, but offers the potential for improved small-sample inference and convenient evidence synthesis. We demonstrate that the Bayesian exponentially tilted empirical likelihood can be used to combine the practical benefits of Bayesian inference with the robustness and attractive large-sample properties of frequentist approaches. Estimators defined as the solutions to unbiased estimating equations can be used to define a semiparametric model through the set of corresponding moment constraints. We prove Bernstein-von Mises theorems which show that the posterior constructed from the resulting exponentially tilted empirical likelihood becomes approximately normal, centred at the chosen estimator with matching asymptotic variance; thus, the posterior has properties analogous to those of the estimator, such as double robustness, and the frequentist coverage of any credible set will be approximately equal to its credibility. The proposed method can be used to obtain modified versions of existing estimators with improved properties, such as guarantees that the estimator lies within the parameter space. Unlike existing Bayesian proposals, our method does not prescribe a particular choice of prior or require posterior variance correction, and simulations suggest that it provides superior performance in terms of frequentist criteria.
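    A minimal sketch of the exponentially tilted empirical likelihood in its standard dual form is shown below. The inverse-probability-weighted estimating function g_i(theta) = (y_i - theta) / pi_i, the flat prior, the grid evaluation, and the simulated data are illustrative assumptions, not the paper's specification.

```python
import numpy as np
from scipy.optimize import minimize

def log_etel(theta, y, pi):
    """Log exponentially tilted empirical likelihood for a scalar parameter,
    using the illustrative estimating function g_i(theta) = (y_i - theta) / pi_i.
    Dual form: minimise the mean of exp(lambda * g_i) over lambda, then the
    tilted weights are proportional to exp(lambda * g_i)."""
    g = (y - theta) / pi
    dual = minimize(lambda lam: np.mean(np.exp(lam[0] * g)), x0=[0.0])
    w = np.exp(dual.x[0] * g)
    w /= w.sum()                                # exponentially tilted weights
    return np.sum(np.log(w))

# Toy posterior over a grid with a flat prior, so log posterior = log ETEL + const
rng = np.random.default_rng(0)
pi = rng.uniform(0.2, 0.8, size=50)             # inclusion probabilities
y = rng.normal(10.0, 2.0, size=50)              # sampled responses
grid = np.linspace(8.0, 12.0, 81)
log_post = np.array([log_etel(t, y, pi) for t in grid])
print(grid[np.argmax(log_post)])                # approximate posterior mode
```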

    On Inadmissibility of Variance Estimators in Unequal Probability Sampling

    1 online resource (PDF, 13 pages)

    Unequal Probability Sampling in Active Learning and Traffic Safety

    This thesis addresses a problem arising in large and expensive experiments where incomplete data come in abundance but statistical analyses require collection of additional information, which is costly. Out of practical and economic considerations, it is necessary to restrict the analysis to a subset of the original database, which inevitably causes a loss of valuable information; thus, choosing this subset in a manner that captures as much of the available information as possible is essential. Using finite population sampling methodology, we address the issue of appropriate subset selection. We show how sample selection may be optimised to maximise precision in estimating various parameters and quantities of interest, and extend the existing finite population sampling methodology to an adaptive, sequential sampling framework, where information required for sample scheme optimisation may be updated iteratively as more data are collected. The implications of model misspecification are discussed, and the robustness of the finite population sampling methodology against model misspecification is highlighted. The proposed methods are illustrated and evaluated on two problems: on subset selection for optimal prediction in active learning (Paper I), and on optimal control sampling for analysis of safety critical events in naturalistic driving studies (Paper II). It is demonstrated that the use of optimised sample selection may reduce the number of records for which complete information needs to be collected by as much as 50%, compared to conventional methods and uniform random sampling.
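    As a rough illustration of optimised subset selection, the sketch below assigns inclusion probabilities proportional to an anticipated-contribution measure and draws the subset by Poisson sampling. The size measure, the capping scheme, and the budget are assumptions of this sketch, not the thesis' design.

```python
import numpy as np

def optimised_inclusion_probs(size_measure, n_target):
    """Inclusion probabilities proportional to an anticipated-contribution
    ('size') measure, capped at 1, so that the expected subset size equals
    n_target. The size measure (e.g. a predicted variance contribution from
    a working model) is an assumption of this sketch."""
    x = np.asarray(size_measure, dtype=float)
    pi = np.minimum(1.0, n_target * x / x.sum())
    # Redistribute among uncapped units until the expected size matches n_target
    for _ in range(100):
        capped = pi >= 1.0
        remaining = n_target - capped.sum()
        pi[~capped] = np.minimum(1.0, remaining * x[~capped] / x[~capped].sum())
        if np.all(pi[~capped] < 1.0):
            break
    return pi

rng = np.random.default_rng(1)
x = rng.lognormal(size=1000)                     # hypothetical predicted contributions
pi = optimised_inclusion_probs(x, n_target=200)
subset = np.flatnonzero(rng.uniform(size=x.size) < pi)    # Poisson sampling
print(pi.sum(), subset.size)                     # expected vs realised subset size
```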