Frequent Elements with Witnesses in Data Streams
Detecting frequent elements is among the oldest and most-studied problems in
the area of data streams. Given a stream of data items in , the objective is to output items that appear at least times, for some
threshold parameter , and provably optimal algorithms are known today.
However, in many applications, knowing only the frequent elements themselves is
not enough: for example, an Internet router may need to know not only the most
frequent destination IP addresses of forwarded packets, but also the
timestamps of when these packets appeared or any other meta-data that
"arrived" with the packets, e.g., their source IP addresses.
In this paper, we introduce the witness version of the frequent elements
problem: Given a desired approximation guarantee and a desired
frequency , where is the frequency of the most frequent
item, the objective is to report an item together with at least
timestamps of when the item appeared in the stream (or any other meta-data that
arrived with the items). We give provably optimal algorithms for both the
insertion-only and insertion-deletion stream settings: In insertion-only
streams, we show that space is
necessary and sufficient for every integral . In
insertion-deletion streams, we show that space is necessary and sufficient, for every .
Comment: Fixed the statement of Lemma 5.1, introduction update
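
To make the witness version of the problem concrete, the following is a minimal sketch of a naive exact baseline for insertion-only streams. It is not the algorithm from the paper and makes no attempt at space efficiency: in the worst case it stores a list of timestamps for every distinct item. The function name `frequent_with_witnesses` and the parameter `num_witnesses` are hypothetical names chosen for illustration, and the position of an item in the stream stands in for its timestamp.

```python
from collections import defaultdict


def frequent_with_witnesses(stream, num_witnesses):
    """Return (item, timestamps) for the first item seen num_witnesses times.

    Naive baseline (an assumption for illustration, not the paper's method):
    every distinct item keeps its own list of timestamps, so memory grows
    with the number of distinct items times num_witnesses.
    """
    if num_witnesses < 1:
        raise ValueError("num_witnesses must be at least 1")

    witnesses = defaultdict(list)  # item -> timestamps collected so far
    for timestamp, item in enumerate(stream):
        seen = witnesses[item]
        seen.append(timestamp)
        if len(seen) == num_witnesses:
            # Enough witnesses collected: report the item with its proof.
            return item, list(seen)
    return None  # no item occurred num_witnesses times


# Hypothetical stream of destination addresses, one per arriving packet.
stream = ["a", "b", "a", "c", "a", "b"]
result = frequent_with_witnesses(stream, 3)
```

Here `result` pairs the item `"a"` with the three positions at which it arrived. Any other meta-data could be stored in place of the position. The results stated in the abstract concern how much of this storage can be avoided when only an approximate number of witnesses is required.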