2 research outputs found

    Ant-Inspired Density Estimation via Random Walks

    Many ant species employ distributed population density estimation in applications ranging from quorum sensing [Pra05], to task allocation [Gor99], to appraisal of enemy colony strength [Ada90]. It has been shown that ants estimate density by tracking encounter rates -- the higher the population density, the more often the ants bump into each other [Pra05,GPT93]. We study distributed density estimation from a theoretical perspective. We prove that a group of anonymous agents randomly walking on a grid are able to estimate their density within a small multiplicative error in few steps by measuring their rates of encounter with other agents. Despite dependencies inherent in the fact that nearby agents may collide repeatedly (and, worse, cannot recognize when this happens), our bound nearly matches what would be required to estimate density by independently sampling grid locations. From a biological perspective, our work helps shed light on how ants and other social insects can obtain relatively accurate density estimates via encounter rates. From a technical perspective, our analysis provides new tools for understanding complex dependencies in the collision probabilities of multiple random walks. We bound the strength of these dependencies using local mixing properties of the underlying graph. Our results extend beyond the grid to more general graphs and we discuss applications to size estimation for social networks and density estimation for robot swarms.
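
The encounter-rate idea in this abstract can be illustrated with a small simulation. The following is a minimal sketch, not the paper's construction: it assumes a square grid with wrap-around edges (a torus), agents placed uniformly at random, and synchronous steps. The function name `estimate_density` and all parameters are hypothetical. Each agent counts how many other agents share its cell after every step and reports that count divided by the number of steps.

```python
import random


def estimate_density(side, num_agents, steps, seed=0):
    """Return one encounter-rate density estimate per agent.

    Assumptions (illustrative only): a side x side torus, uniform random
    starting cells, and all agents moving one cell per synchronous step.
    """
    rng = random.Random(seed)
    moves = [(1, 0), (-1, 0), (0, 1), (0, -1)]
    positions = [(rng.randrange(side), rng.randrange(side))
                 for _ in range(num_agents)]
    collisions = [0] * num_agents

    for _ in range(steps):
        occupancy = {}
        next_positions = []
        for x, y in positions:
            dx, dy = rng.choice(moves)
            cell = ((x + dx) % side, (y + dy) % side)
            next_positions.append(cell)
            occupancy[cell] = occupancy.get(cell, 0) + 1
        positions = next_positions
        # Agents are anonymous: each one only learns how many others share
        # its cell, not which ones, so repeat encounters are counted again.
        for i, cell in enumerate(positions):
            collisions[i] += occupancy[cell] - 1

    return [c / steps for c in collisions]


if __name__ == "__main__":
    side, num_agents = 21, 45
    estimates = estimate_density(side, num_agents, steps=2000, seed=1)
    # Density as seen by one agent: the other agents per grid cell.
    print("true density:", (num_agents - 1) / side**2)
    print("mean estimate:", sum(estimates) / len(estimates))
```

An odd side length is used in the example because a torus with an even side is bipartite, so two agents that start on cells of opposite parity and move in lockstep would never meet.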

    Scalable Methods and Algorithms for Very Large Graphs Based on Sampling

    Analyzing real-life networks is a computationally intensive task due to the sheer size of the networks. Direct analysis is impossible when the network data is not entirely accessible. For instance, user networks in Twitter and Facebook are not available for third parties to explore their properties directly. Thus, sampling-based algorithms are indispensable. This dissertation addresses the confidence interval (CI) and bias problems in real-world network analysis. It uses estimations of the number of triangles (hereafter Δ) and the clustering coefficient (hereafter C) as a case study. The metric Δ is an important measurement for understanding a graph. It is also directly related to C, which is one of the most important indicators for social networks. The methods proposed in this dissertation can be utilized in other sampling problems. First, we proposed two new methods to estimate Δ based on random edge sampling in both streaming and non-streaming models. These methods outperformed the state-of-the-art methods consistently and could be better by orders of magnitude when the graph is very large. More importantly, we proved the improvement ratio analytically and verified our result extensively in real-world networks. The analytical results were achieved by simplifying the variances of the estimators based on the assumption that the graph is very large. We believe that such a big-data assumption can lead to interesting results not only in triangle estimation but also in other sampling problems. Next, we studied the estimation of C in both streaming and non-streaming sampling models. Despite numerous algorithms proposed in this area, the bias and variance of the estimators remain an open problem. We quantified the bias using Taylor expansion and found that the bias can be determined by the structure of the sampled data. Based on this understanding of the bias, we gave new estimators that correct it. The results were derived analytically and verified in 56 real networks of different sizes and structures. The experiments reveal that the bias varies widely from dataset to dataset. The relative bias can be as high as 4% in the non-streaming model and 2% in the streaming model, and it can also be negative. We also derived the variances of the estimators, and the estimators for the variances. Our simplified estimators can be used in practice to control the accuracy level of estimations.
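
To make "estimating Δ by random edge sampling" concrete, the sketch below shows a standard baseline, not the dissertation's proposed estimators: keep each edge independently with probability p, count triangles in the sampled graph, and scale the count by 1/p³, since a triangle survives only if all three of its edges are kept. The function names, the integer node labels, and the complete-graph example are assumptions for illustration.

```python
import random
from itertools import combinations


def count_triangles(edges):
    """Exact triangle count for an undirected simple graph given as edge pairs."""
    adj = {}
    for u, v in edges:
        if u == v:
            continue  # ignore self-loops
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    total = 0
    for u, neighbours in adj.items():
        for v in neighbours:
            if u < v:  # visit each undirected edge once
                total += len(neighbours & adj[v])
    # Each triangle is found once from each of its three edges.
    return total // 3


def estimate_triangles(edges, p, rng):
    """Baseline estimate: sample edges with probability p, rescale by 1/p^3."""
    sampled = [e for e in edges if rng.random() < p]
    return count_triangles(sampled) / p**3


if __name__ == "__main__":
    # Hypothetical test graph: the complete graph on 30 nodes.
    edges = list(combinations(range(30), 2))
    rng = random.Random(7)
    print("exact:", count_triangles(edges))
    print("estimate (p=0.5):", estimate_triangles(edges, 0.5, rng))
```

A single run of this baseline can be noisy, because triangles that share an edge are kept or dropped together. Controlling that variance, and the bias that appears once the estimate is turned into a clustering coefficient, is the problem the abstract describes.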