Bias reduction in traceroute sampling: towards a more accurate map of the Internet
Traceroute sampling is an important technique in exploring the internet
router graph and the autonomous system graph. Although it is one of the primary
techniques used in calculating statistics about the internet, it can introduce
bias that corrupts these estimates. This paper reports on a theoretical and
experimental investigation of a new technique to reduce the bias of traceroute
sampling when estimating the degree distribution. We develop a new estimator
for the degree of a node in a traceroute-sampled graph; validate the estimator
theoretically in Erdos-Renyi graphs and, through computer experiments, for a
wider range of graphs; and apply it to produce a new picture of the degree
distribution of the autonomous system graph. Comment: 12 pages, 3 figures
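The bias this abstract refers to can be illustrated with a small simulation. The sketch below is not the paper's estimator: it assumes an idealised single-monitor model in which traceroute sampling is replaced by a breadth-first search tree from one source, and the graph size, edge probability, and function names are hypothetical choices made for illustration. It compares the degree each node appears to have in the sampled tree with its true degree in an Erdos-Renyi graph.

```python
import random
from collections import deque


def erdos_renyi(n, p, rng):
    """Return an adjacency map for a G(n, p) random graph."""
    adj = {v: set() for v in range(n)}
    for u in range(n):
        for v in range(u + 1, n):
            if rng.random() < p:
                adj[u].add(v)
                adj[v].add(u)
    return adj


def bfs_tree_degrees(adj, source):
    """Observed degree of every reached node in a BFS tree rooted at `source`.

    The BFS tree is an idealised stand-in for the paths seen by a single
    traceroute monitor; real traceroute behaviour is more complicated.
    """
    observed = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in observed:
                observed[v] = 1   # the tree edge back to the parent u
                observed[u] += 1  # the same edge, counted at the parent
                queue.append(v)
    return observed


# Hypothetical parameters, chosen only to make the effect visible.
rng = random.Random(0)
adj = erdos_renyi(300, 0.05, rng)
observed = bfs_tree_degrees(adj, source=0)

true_mean = sum(len(adj[v]) for v in observed) / len(observed)
obs_mean = sum(observed.values()) / len(observed)
print(f"mean true degree {true_mean:.2f}, mean observed degree {obs_mean:.2f}")
```

Because a tree on k nodes has only k - 1 edges, the observed mean degree stays below 2 however dense the underlying graph is, which is one simple way to see why degree statistics read directly off sampled paths can be misleading.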
One-step Estimation of Networked Population Size: Respondent-Driven Capture-Recapture with Anonymity
Population size estimates for hidden and hard-to-reach populations are
particularly important when members are known to suffer from disproportionate
health issues or to pose health risks to the larger ambient population in which
they are embedded. Efforts to derive size estimates are often frustrated by a
range of factors that preclude conventional survey strategies, including social
stigma associated with group membership or members' involvement in illegal
activities.
This paper extends prior research on the problem of network population size
estimation, building on established survey/sampling methodologies commonly used
with hard-to-reach groups. Three novel one-step, network-based population size
estimators are presented, to be used in the context of uniform random sampling,
respondent-driven sampling, and when networks exhibit significant clustering
effects. Provably sufficient conditions for the consistency of these estimators
(in large configuration networks) are given. Simulation experiments across a
wide range of synthetic network topologies validate the performance of the
estimators, which are seen to perform well on a real-world location-based
social networking data set with significant clustering. Finally, the proposed
schemes are extended to allow them to be used in settings where participant
anonymity is required. Systematic experiments show favorable tradeoffs between
anonymity guarantees and estimator performance.
Taken together, we demonstrate that reasonable population estimates can be
derived from anonymous respondent-driven samples of 250-750 individuals, within
ambient populations of 5,000-40,000. The method thus represents a novel and
cost-effective means for health planners and those agencies concerned with
health and disease surveillance to estimate the size of hidden populations.
Limitations and future work are discussed in the concluding section.
Monitoring wild animal communities with arrays of motion sensitive camera traps
Studying animal movement and distribution is of critical importance to
addressing environmental challenges including invasive species, infectious
diseases, climate and land-use change. Motion sensitive camera traps offer a
visual sensor to record the presence of a broad range of species providing
location-specific information on movement and behavior. Modern digital camera
traps that record video present new analytical opportunities, but also new data
management challenges. This paper describes our experience with a terrestrial
animal monitoring system at Barro Colorado Island, Panama. Our camera network
captured the spatio-temporal dynamics of terrestrial bird and mammal activity
at the site - data relevant to immediate science questions, and long-term
conservation issues. We believe that the experience gained and lessons learned
during our year-long deployment and testing of the camera traps, as well as the
solutions developed, are applicable to broader sensor network applications and
are valuable for the advancement of sensor network research. We suggest
that the continued development of these hardware, software, and analytical
tools, in concert, offers an exciting sensor-network solution for monitoring
animal populations that could realistically scale over larger areas and time
spans.
Reliable Identification of RFID Tags Using Multiple Independent Reader Sessions
Radio Frequency Identification (RFID) systems are gaining momentum in various
applications of logistics, inventory, etc. A generic problem in such systems is
to ensure that the RFID readers can reliably read a set of RFID tags, such that
the probability of missing tags stays below an acceptable value. A tag may be
missing (left unread) due to errors in the communication link towards the
reader, e.g., due to obstacles in the radio path. The present paper proposes
techniques that use multiple reader sessions, during which the system of
readers obtains a running estimate of the probability to have at least one tag
missing. Based on such an estimate, it is decided whether an additional reader
session is required. Two methods are proposed; both rely on the statistical
independence of the tag reading errors across different reader sessions, which
is a plausible assumption when, e.g., each reader session is executed on
different readers. The first method uses statistical relationships that are
valid when the reader sessions are independent. The second method is obtained
by modifying an existing capture-recapture estimator. The results show that,
when the reader sessions are independent, the proposed mechanisms provide a
good approximation to the probability of missing tags, such that the number of
reader sessions made meets the target specification. If the assumption of
independence is violated, the estimators are still useful, but they should be
corrected by a margin of additional reader sessions to ensure that the target
probability of missing tags is met. Comment: Presented at IEEE RFID 2009 Conference
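The reasoning in this abstract can be made concrete with a short sketch. It is not the paper's method: the abstract does not say which capture-recapture estimator is modified, so the classic two-sample Lincoln-Petersen estimator is swapped in here, and the per-session miss probability `p_miss`, the function names, and the sample tag sets are all hypothetical. Under the independence assumption, a tag missed with probability `p_miss` in each session remains unread after k sessions with probability `p_miss ** k`.

```python
def prob_any_tag_missed(n_tags, p_miss, n_sessions):
    """P(at least one of n_tags is still unread after n_sessions).

    Assumes every tag is missed independently with probability p_miss
    in each session (an illustrative assumption).
    """
    p_tag_unread = p_miss ** n_sessions
    return 1.0 - (1.0 - p_tag_unread) ** n_tags


def sessions_needed(n_tags, p_miss, target):
    """Smallest number of sessions keeping the miss probability <= target."""
    if not 0.0 <= p_miss < 1.0 or target <= 0.0:
        raise ValueError("need 0 <= p_miss < 1 and target > 0")
    k = 1
    while prob_any_tag_missed(n_tags, p_miss, k) > target:
        k += 1
    return k


def lincoln_petersen(read_a, read_b):
    """Classic two-sample capture-recapture estimate of the total tag count.

    read_a and read_b are the sets of tag IDs read in two independent sessions.
    """
    overlap = len(read_a & read_b)
    if overlap == 0:
        raise ValueError("no tag was read in both sessions; estimate undefined")
    return len(read_a) * len(read_b) / overlap


# Hypothetical example: two sessions that each read four tags, two in common.
session_a = {"t1", "t2", "t3", "t4"}
session_b = {"t3", "t4", "t5", "t6"}
estimated_total = lincoln_petersen(session_a, session_b)
tags_seen = len(session_a | session_b)
print(estimated_total, tags_seen)
```

Comparing the estimated total with the number of distinct tags actually seen gives a rough indication of whether another session is worthwhile; when sessions are correlated, the estimate tends to be too optimistic, which matches the abstract's advice to add a margin of extra sessions.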
One-step estimation of networked population size: Respondent-driven capture-recapture with anonymity
Size estimation is particularly important for populations whose members experience disproportionate health issues or pose elevated health risks to the ambient social structures in which they are embedded. Efforts to derive size estimates are often frustrated when the population is hidden or hard-to-reach in ways that preclude conventional survey strategies, as is the case when social stigma is associated with group membership or when group members are involved in illegal activities. This paper extends prior research on the problem of network population size estimation, building on established survey/sampling methodologies commonly used with hard-to-reach groups. Three novel one-step, network-based population size estimators are presented, for use in the context of uniform random sampling, respondent-driven sampling, and when networks exhibit significant clustering effects. We give provably sufficient conditions for the consistency of these estimators in large configuration networks. Simulation experiments across a wide range of synthetic network topologies validate the performance of the estimators, which also perform well on a real-world location-based social networking data set with significant clustering. Finally, the proposed schemes are extended to allow them to be used in settings where participant anonymity is required. Systematic experiments show favorable trade-offs between anonymity guarantees and estimator performance. Taken together, we demonstrate that reasonable population size estimates can be derived from anonymous respondent-driven samples of 250-750 individuals, within ambient populations of 5,000-40,000. The method thus represents a novel and cost-effective means for health planners and those agencies concerned with health and disease surveillance to estimate the size of hidden populations. We discuss limitations and future work in the concluding section.