As networks continue to grow in size and complexity, distributed network monitoring and resource querying are becoming increasingly difficult and costly. We have designed, built, and evaluated a scalable infrastructure for answering queries over distributed measurements, while reducing costs (in terms of both network traffic and query latency) and maximizing the precision of results. In this infrastructure, each network node owns a set of numerical measurement values and actively maintains bounds on these values cached at other nodes. We can then answer queries approximately, using bounds from nearby caches to avoid contacting the owners directly. We argue that approximate results are acceptable for our target applications, as long as errors are quantified precisely and reported to the user, and there is a mechanism for the user to obtain results with a specified precision. We have designed, implemented, and evaluated two approaches: one, called QONCH-1, uses a recursive partitioning of the network space to place caches in a static, controlled manner, while the other, called QONCH-2, uses a locality-aware distributed hash table to place caches in a dynamic and decentralized manner. We use large-scale network emulation to demonstrate that our techniques are very effective in reducing query costs while generating an acceptable amount of background traffic. They are also able to exploit various forms of locality that are naturally present in queries, and to adapt to the volatility of measurements.
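
To make the bounded-cache idea concrete, the following is a minimal sketch of how a SUM query might be answered from cached bounds and refined to a user-specified precision. It is an illustration only, not the QONCH-1 or QONCH-2 implementation: the names `Bound`, `owner_update`, and `sum_query`, the fixed half-width used when re-centering a bound, and the greedy widest-bound-first refinement policy are all assumptions introduced here.

```python
from dataclasses import dataclass


@dataclass
class Bound:
    """A cached interval [lo, hi] guaranteed to contain the owner's value."""
    lo: float
    hi: float

    def width(self):
        return self.hi - self.lo


def owner_update(value, cached, half_width):
    """Owner side (hypothetical policy): send a new bound only when the
    measurement leaves the cached interval; otherwise no traffic is needed."""
    if cached.lo <= value <= cached.hi:
        return cached, False
    return Bound(value - half_width, value + half_width), True


def sum_query(cached, exact_values, max_width):
    """Query side: answer SUM from cached bounds, contacting owners
    (widest bound first, an assumed heuristic) only until the result
    interval is no wider than max_width."""
    bounds = dict(cached)
    contacted = []

    def total():
        return Bound(sum(b.lo for b in bounds.values()),
                     sum(b.hi for b in bounds.values()))

    result = total()
    for node in sorted(bounds, key=lambda n: bounds[n].width(), reverse=True):
        if result.width() <= max_width:
            break
        exact = exact_values[node]          # stands in for contacting the owner
        bounds[node] = Bound(exact, exact)  # an exact value is a zero-width bound
        contacted.append(node)
        result = total()
    return result, contacted


if __name__ == "__main__":
    cache = {"a": Bound(10, 20), "b": Bound(5, 7), "c": Bound(0, 1)}
    truth = {"a": 12, "b": 6, "c": 0.5}
    print(sum_query(cache, truth, max_width=5))
```

In this sketch the returned interval is the quantified error reported to the user, and `max_width` plays the role of the requested precision: a loose requirement is served entirely from the cache, while a tight one triggers contact with only as many owners as needed.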