Wireless sensor networks offer the potential to span and monitor large
geographical areas inexpensively. Sensors, however, have significant power
constraints (limited battery life), making communication very expensive. Another
important issue in the context of sensor-based information systems is that
individual sensor readings are inherently unreliable. In order to address these
two aspects, sensor database systems like TinyDB and Cougar enable in-network
data aggregation to reduce the communication cost and improve reliability. The
existing data aggregation techniques, however, are limited to relatively simple
types of queries such as SUM, COUNT, AVG, and MIN/MAX. In this paper we propose
a data aggregation scheme that significantly extends the class of queries that
can be answered using sensor networks. These queries include (approximate)
quantiles, such as the median, the most frequent data values, such as the
consensus value, a histogram of the data distribution, as well as range
queries. In our scheme, each sensor aggregates the data it has received from
other sensors into a message of fixed, user-specified size. We provide strict
theoretical guarantees on the approximation quality of the queries in terms of
the message size. We evaluate the performance of our aggregation scheme by
simulation and demonstrate its accuracy, scalability, and low resource
utilization for highly variable input data sets.
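
To make the idea of fixed-size in-network aggregation concrete, the following is a minimal sketch in Python. It is not the scheme proposed in this paper and carries none of its guarantees; it only illustrates, under simplifying assumptions, how a sensor might fold its own readings and its children's summaries into a message of at most k buckets and how an approximate quantile could then be read off at the root. All names (Bucket, compress, summarize, merge, quantile) and the greedy compression rule are hypothetical and chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Bucket:
    lo: int      # smallest reading covered by this bucket
    hi: int      # largest reading covered by this bucket
    count: int   # number of readings folded into this bucket


def compress(buckets, k):
    """Greedily merge adjacent buckets until at most k remain (illustrative rule)."""
    buckets = sorted(buckets, key=lambda b: (b.lo, b.hi))
    while len(buckets) > k:
        # Pick the adjacent pair whose combined count is smallest, so that
        # the coarsening affects as few readings as possible.
        i = min(range(len(buckets) - 1),
                key=lambda j: buckets[j].count + buckets[j + 1].count)
        a, b = buckets[i], buckets[i + 1]
        buckets[i:i + 2] = [Bucket(a.lo, max(a.hi, b.hi), a.count + b.count)]
    return buckets


def summarize(readings, k):
    """Build a summary of at most k buckets from raw integer readings."""
    counts = {}
    for r in readings:
        counts[r] = counts.get(r, 0) + 1
    return compress([Bucket(v, v, c) for v, c in counts.items()], k)


def merge(own, children, k):
    """Combine a sensor's own summary with its children's, keeping size <= k."""
    combined = list(own)
    for child in children:
        combined.extend(child)
    return compress(combined, k)


def quantile(summary, q):
    """Return an approximate q-quantile (0 < q <= 1) from a summary."""
    total = sum(b.count for b in summary)
    seen = 0
    for b in summary:
        seen += b.count
        if seen >= q * total:
            return b.hi  # the answer is only known up to the bucket's range
    return summary[-1].hi


# Two leaf sensors report to a parent; every message holds at most k buckets.
k = 3
leaf1 = summarize([10, 12, 11, 50], k)
leaf2 = summarize([13, 12, 14, 15], k)
root = merge(summarize([12], k), [leaf1, leaf2], k)
approx_median = quantile(root, 0.5)
```

In this sketch the message size never exceeds k regardless of how many readings lie below a sensor, while the total count is preserved exactly; what is lost is the position of a reading within its bucket, which is why a larger k would be expected to give a finer answer.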