4,042 research outputs found

    Bias reduction in traceroute sampling: towards a more accurate map of the Internet

    Full text link
    Traceroute sampling is an important technique in exploring the internet router graph and the autonomous system graph. Although it is one of the primary techniques used in calculating statistics about the internet, it can introduce bias that corrupts these estimates. This paper reports on a theoretical and experimental investigation of a new technique to reduce the bias of traceroute sampling when estimating the degree distribution. We develop a new estimator for the degree of a node in a traceroute-sampled graph; validate the estimator theoretically in Erdos-Renyi graphs and, through computer experiments, for a wider range of graphs; and apply it to produce a new picture of the degree distribution of the autonomous system graph.Comment: 12 pages, 3 figure

    One-step Estimation of Networked Population Size: Respondent-Driven Capture-Recapture with Anonymity

    Get PDF
    Population size estimates for hidden and hard-to-reach populations are particularly important when members are known to suffer from disproportion health issues or to pose health risks to the larger ambient population in which they are embedded. Efforts to derive size estimates are often frustrated by a range of factors that preclude conventional survey strategies, including social stigma associated with group membership or members' involvement in illegal activities. This paper extends prior research on the problem of network population size estimation, building on established survey/sampling methodologies commonly used with hard-to-reach groups. Three novel one-step, network-based population size estimators are presented, to be used in the context of uniform random sampling, respondent-driven sampling, and when networks exhibit significant clustering effects. Provably sufficient conditions for the consistency of these estimators (in large configuration networks) are given. Simulation experiments across a wide range of synthetic network topologies validate the performance of the estimators, which are seen to perform well on a real-world location-based social networking data set with significant clustering. Finally, the proposed schemes are extended to allow them to be used in settings where participant anonymity is required. Systematic experiments show favorable tradeoffs between anonymity guarantees and estimator performance. Taken together, we demonstrate that reasonable population estimates can be derived from anonymous respondent driven samples of 250-750 individuals, within ambient populations of 5,000-40,000. The method thus represents a novel and cost-effective means for health planners and those agencies concerned with health and disease surveillance to estimate the size of hidden populations. Limitations and future work are discussed in the concluding section

    Monitoring wild animal communities with arrays of motion sensitive camera traps

    Get PDF
    Studying animal movement and distribution is of critical importance to addressing environmental challenges including invasive species, infectious diseases, climate and land-use change. Motion sensitive camera traps offer a visual sensor to record the presence of a broad range of species providing location -specific information on movement and behavior. Modern digital camera traps that record video present new analytical opportunities, but also new data management challenges. This paper describes our experience with a terrestrial animal monitoring system at Barro Colorado Island, Panama. Our camera network captured the spatio-temporal dynamics of terrestrial bird and mammal activity at the site - data relevant to immediate science questions, and long-term conservation issues. We believe that the experience gained and lessons learned during our year long deployment and testing of the camera traps as well as the developed solutions are applicable to broader sensor network applications and are valuable for the advancement of the sensor network research. We suggest that the continued development of these hardware, software, and analytical tools, in concert, offer an exciting sensor-network solution to monitoring of animal populations which could realistically scale over larger areas and time spans

    Reliable Identification of RFID Tags Using Multiple Independent Reader Sessions

    Full text link
    Radio Frequency Identification (RFID) systems are gaining momentum in various applications of logistics, inventory, etc. A generic problem in such systems is to ensure that the RFID readers can reliably read a set of RFID tags, such that the probability of missing tags stays below an acceptable value. A tag may be missing (left unread) due to errors in the communication link towards the reader e.g. due to obstacles in the radio path. The present paper proposes techniques that use multiple reader sessions, during which the system of readers obtains a running estimate of the probability to have at least one tag missing. Based on such an estimate, it is decided whether an additional reader session is required. Two methods are proposed, they rely on the statistical independence of the tag reading errors across different reader sessions, which is a plausible assumption when e.g. each reader session is executed on different readers. The first method uses statistical relationships that are valid when the reader sessions are independent. The second method is obtained by modifying an existing capture-recapture estimator. The results show that, when the reader sessions are independent, the proposed mechanisms provide a good approximation to the probability of missing tags, such that the number of reader sessions made, meets the target specification. If the assumption of independence is violated, the estimators are still useful, but they should be corrected by a margin of additional reader sessions to ensure that the target probability of missing tags is met.Comment: Presented at IEEE RFID 2009 Conferenc

    One-step estimation of networked population size: Respondent-driven capture-recapture with anonymity

    Get PDF
    Size estimation is particularly important for populations whose members experience disproportionate health issues or pose elevated health risks to the ambient social structures in which they are embedded. Efforts to derive size estimates are often frustrated when the population is hidden or hard-to-reach in ways that preclude conventional survey strategies, as is the case when social stigma is associated with group membership or when group members are involved in illegal activities. This paper extends prior research on the problem of network population size estimation, building on established survey/sampling methodologies commonly used with hard-to-reach groups. Three novel one-step, network-based population size estimators are presented, for use in the context of uniform random sampling, respondent-driven sampling, and when networks exhibit significant clustering effects. We give provably sufficient conditions for the consistency of these estimators in large configuration networks. Simulation experiments across a wide range of synthetic network topologies validate the performance of the estimators, which also perform well on a real-world location based social networking data set with significant clustering. Finally, the proposed schemes are extended to allow them to be used in settings where participant anonymity is required. Systematic experiments show favorable trade-offs between anonymity guarantees and estimator performance. Taken together, we demonstrate that reasonable population size estimates are derived from anonymous respondent driven samples of 250-750 individuals, within ambient populations of 5,000-40,000. The method thus represents a novel and cost-effective means for health planners and those agencies concerned with health and disease surveillance to estimate the size of hidden populations. We discuss limitations and future work in the concluding section
    • …
    corecore