Methods for Combining Probability and Nonprobability Samples Under Unknown Overlaps
Nonprobability (convenience) samples are increasingly sought to stabilize
estimations for one or more population variables of interest that are performed
using a randomized survey (reference) sample by increasing the effective sample
size. Estimation of a population quantity derived from a convenience sample
will typically result in bias since the distribution of variables of interest
in the convenience sample is different from the population. A recent set of
approaches estimates conditional (on sampling design predictors) inclusion
probabilities for convenience sample units by specifying reference
sample-weighted pseudo likelihoods. This paper introduces a novel approach that
derives the propensity score for the observed sample as a function of
conditional inclusion probabilities for the reference and convenience samples
as our main result. Our approach allows specification of an exact likelihood
for the observed sample. We construct a Bayesian hierarchical formulation that
simultaneously estimates sample propensity scores and both conditional and
reference sample inclusion probabilities for the convenience sample units. We
compare our exact likelihood with the pseudo likelihoods in a Monte Carlo
simulation study.
Comment: 32 pages, 8 figures
Methods for combining probability and nonprobability samples under unknown overlaps
Nonprobability (convenience) samples are increasingly sought to reduce the estimation variance for one or more population variables of interest that are estimated using a randomized survey (reference) sample by increasing the effective sample size. Estimation of a population quantity derived from a convenience sample will typically result in bias, since the distribution of variables of interest in the convenience sample differs from the population distribution. A recent set of approaches estimates inclusion probabilities for convenience sample units by specifying reference sample-weighted pseudo likelihoods. As our main result, this paper introduces a novel approach that derives the propensity score for the observed sample as a function of inclusion probabilities for the reference and convenience samples. Our approach allows specification of a likelihood directly for the observed sample, as opposed to an approximate or pseudo likelihood. We construct a Bayesian hierarchical formulation that simultaneously estimates sample propensity scores and the convenience sample inclusion probabilities. We use a Monte Carlo simulation study to compare our likelihood-based results with the pseudo likelihood-based approaches considered in the literature.
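
The bias described in the abstracts, and the reason inclusion probabilities matter, can be illustrated with a small simulation. The sketch below is not the paper's method: it assumes a hypothetical population, an invented logistic selection mechanism (`inclusion_prob`), and, unrealistically, that the convenience-sample inclusion probabilities are known exactly. In practice they are unknown, which is why the approaches above estimate them. The sketch compares the unweighted sample mean with a Hajek-type inverse-probability-weighted mean.

```python
import math
import random

rng = random.Random(0)  # fixed seed so the sketch is reproducible

# Hypothetical finite population of N units with one variable of interest y.
N = 20_000
population = [rng.gauss(50.0, 10.0) for _ in range(N)]
true_mean = sum(population) / N


def inclusion_prob(y):
    """Assumed, purely illustrative selection mechanism: units with larger y
    are more likely to end up in the convenience sample."""
    return 1.0 / (1.0 + math.exp(-(y - 50.0) / 10.0))


# Each unit enters the convenience sample independently with its own probability.
sample = [y for y in population if rng.random() < inclusion_prob(y)]

# The unweighted mean ignores the selection mechanism.
naive_mean = sum(sample) / len(sample)

# Hajek-type estimator: weight each sampled unit by the inverse of its
# inclusion probability (assumed known here, which is the unrealistic part).
weights = [1.0 / inclusion_prob(y) for y in sample]
weighted_mean = sum(w * y for w, y in zip(weights, sample)) / sum(weights)

print(f"population mean:      {true_mean:.2f}")
print(f"naive sample mean:    {naive_mean:.2f}")
print(f"weighted sample mean: {weighted_mean:.2f}")
```

Under these assumptions the naive mean overshoots the population mean because high-y units are over-represented, while the weighted mean lands much closer to it. The quality of that correction depends entirely on how well the inclusion probabilities are known or estimated.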