Surveillance research is of great importance for effective and efficient
epidemiological monitoring of case counts and disease prevalence. Taking
specific motivation from ongoing efforts to identify recurrent cases based on
the Georgia Cancer Registry, we extend the recently proposed "anchor stream"
sampling design and estimation methodology. Our approach offers a more
efficient and defensible alternative to traditional capture-recapture (CRC)
methods by leveraging a relatively small random sample of participants whose
recurrence status is obtained through a principled application of medical
records abstraction. This sample is combined with one or more existing
signaling data streams, which may yield data based on arbitrarily
non-representative subsets of the full registry population. The key extension
developed here accounts for the common problem of false-positive or false-negative
diagnostic signals from the existing data stream(s). In particular, we show
that the design only requires documentation of positive signals in these
non-anchor surveillance streams, and permits valid estimation of the true case
count based on an estimable positive predictive value (PPV) parameter. We
borrow ideas from the multiple imputation paradigm to provide accompanying
standard errors, and develop an adapted Bayesian credible interval approach
that yields favorable frequentist coverage properties. We demonstrate the
benefits of the proposed methods through simulation studies, and provide a data
example targeting estimation of the breast cancer recurrence case count among
Metro Atlanta area patients from the Georgia Cancer Registry-based Cancer
Recurrence Information and Surveillance Program (CRISP) database.
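
To make the role of the PPV concrete, the following minimal sketch shows one way a random anchor sample could be combined with positive-only signals from a single non-anchor stream. It uses a simple post-stratified plug-in estimator, which is an assumption made here for illustration and not necessarily the estimator developed in the paper. The function name `estimate_case_count` and its arguments are hypothetical. The sketch covers the point estimate only and does not attempt the standard errors or credible intervals.

```python
def estimate_case_count(n_total, flagged_ids, anchor_status):
    """Illustrative plug-in estimate of the true case count.

    n_total       : size of the full registry population
    flagged_ids   : ids with a positive signal in the non-anchor stream
                    (only positive signals need to be documented)
    anchor_status : dict mapping id -> True/False, the validated status
                    for a simple random sample drawn from the registry
    """
    flagged = set(flagged_ids)

    # Split the anchor sample by whether the non-anchor stream flagged the person.
    in_flagged = [s for i, s in anchor_status.items() if i in flagged]
    in_unflagged = [s for i, s in anchor_status.items() if i not in flagged]
    if not in_flagged or not in_unflagged:
        raise ValueError("anchor sample must overlap both flagged and unflagged groups")

    # PPV: share of flagged anchor-sample members who are true cases.
    ppv_hat = sum(in_flagged) / len(in_flagged)
    # Share of unflagged anchor-sample members who are nevertheless true cases.
    miss_rate_hat = sum(in_unflagged) / len(in_unflagged)

    n_flagged = len(flagged)
    count_hat = n_flagged * ppv_hat + (n_total - n_flagged) * miss_rate_hat
    return ppv_hat, count_hat
```

Because the anchor sample is random, it supports estimation of the case proportion both among flagged and among unflagged individuals, however non-representative the flagged subset is.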