Sample coordination, where similar instances have similar samples, was
proposed by statisticians four decades ago as a way to maximize overlap in
repeated surveys. Coordinated sampling has since been used for summarizing
massive data sets.
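
As a rough illustration of what coordination means, the sketch below samples two similar instances with a seed shared per key, so a key's inclusion decisions are correlated across instances. It assumes a threshold (Poisson-style) scheme in which a key with weight w is kept with probability min(1, w / tau). The names `shared_seed`, `coordinated_sample`, `tau`, and the example data are hypothetical and are not taken from the paper.

```python
import hashlib


def shared_seed(key: str) -> float:
    """Map a key to a reproducible value in [0, 1).

    The same key receives the same seed in every instance, which is
    what makes the samples coordinated (assumed hashing scheme).
    """
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") / 2**64


def coordinated_sample(instance: dict, tau: float) -> dict:
    """Keep each key whose weight is at least seed * tau.

    For a uniform seed this keeps a key with probability min(1, w / tau).
    """
    return {k: w for k, w in instance.items() if w >= shared_seed(k) * tau}


# Two similar instances over the same keys (hypothetical weights).
monday = {"a": 5.0, "b": 1.0, "c": 3.0, "d": 0.5}
tuesday = {"a": 5.5, "b": 1.0, "c": 2.5, "d": 0.7}

sample_monday = coordinated_sample(monday, tau=4.0)
sample_tuesday = coordinated_sample(tuesday, tau=4.0)

# Keys sampled in both instances; coordination tends to make this large.
overlap = sample_monday.keys() & sample_tuesday.keys()
print(sorted(sample_monday), sorted(sample_tuesday), sorted(overlap))
```

Because the seed depends only on the key, a key whose weight does not decrease between instances stays in the sample, which is the overlap-maximizing behavior described above. Independent sampling would instead draw a fresh seed per instance.
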
The usefulness of a sampling scheme hinges on the scope and accuracy within
which queries posed over the original data can be answered from the sample. We
aim here to gain a fundamental understanding of the limits and potential of
coordination. Our main result is a precise characterization, in terms of simple
properties of the estimated function, of queries for which estimators with
desirable properties exist. We consider unbiasedness, nonnegativity, finite
variance, and bounded estimates.
Since a single estimator generally cannot be optimal (minimize variance
simultaneously) for all data, we propose {\em variance competitiveness}, which
means that, on any data, the expectation of the squared estimate is not too far
from the minimum possible for that data. Surprisingly perhaps, we show how to
construct, for any function for which an unbiased nonnegative estimator exists,
a variance competitive estimator.

Comment: 4 figures, 21 pages, Extended Abstract appeared in RANDOM 201