
Abstract. We consider the problem of identifying periodic trends in data streams. We say a signal a ∈ R^n is p-periodic if a_i = a_{i+p} for all i ∈ [n − p]. Recently, Ergün et al. [4] presented a one-pass, O(polylog n)-space algorithm for identifying the smallest period of a signal. Their algorithm required a to be presented in the time-series model, i.e., a_i is the ith element in the stream. We present a more general linear sketch algorithm that has the advantages of being applicable to a) the turnstile stream model, where coordinates can be incremented/decremented in an arbitrary fashion, and b) the parallel or distributed setting, where the signal is distributed over multiple locations/machines. We also present sketches for (1+ε)-approximating the ℓ_2 distance between a and the nearest p-periodic signal for a given p. Our algorithm uses O(ε^{-2} polylog n) space, comparing favorably to an earlier time-series result that used O(ε^{-5.5} √p polylog n) space for estimating the Hamming distance to the nearest p-periodic signal. Our last periodicity result is an algorithm for estimating the periodicity of a sequence in the presence of noise. We conclude with a small-space algorithm for identifying when two signals are exact (or near) cyclic shifts of one another. Our algorithms are based on bilinear sketches [10] and on combining Fourier transforms with stream processing techniques such as ℓ_p sampling and sketching [11, 13].
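
To make the definitions in the abstract concrete, the following is a minimal offline sketch in Python. It is not the paper's linear sketch algorithm: it stores the whole signal and uses brute force, so it only illustrates what "p-periodic", "smallest period", and "ℓ_2 distance to the nearest p-periodic signal" mean. The function names are hypothetical, and 0-based indexing is assumed. The distance computation relies on the standard fact that the closest p-periodic signal in ℓ_2 replaces each entry by the mean of its residue class modulo p.

```python
import math


def is_periodic(a, p):
    """Return True if a[i] == a[i + p] for every index i with i + p < len(a)."""
    return all(a[i] == a[i + p] for i in range(len(a) - p))


def smallest_period(a):
    """Brute-force smallest p in 1..n such that a is p-periodic.

    Illustration only: this keeps the full signal in memory, unlike the
    small-space streaming algorithms discussed in the abstract.
    """
    for p in range(1, len(a) + 1):
        if is_periodic(a, p):
            return p
    return len(a)


def l2_distance_to_periodic(a, p):
    """Exact l2 distance from a to the nearest p-periodic signal.

    Entries whose indices agree modulo p must be equal in a p-periodic
    signal, so the best choice for each residue class is its mean.
    """
    total = 0.0
    for r in range(p):
        residue_class = a[r::p]
        if not residue_class:
            continue
        mean = sum(residue_class) / len(residue_class)
        total += sum((x - mean) ** 2 for x in residue_class)
    return math.sqrt(total)


if __name__ == "__main__":
    clean = [1, 2, 3, 1, 2, 3, 1]
    noisy = [1, 2, 3, 1, 2, 3, 2]  # last entry perturbed
    print(smallest_period(clean))
    print(l2_distance_to_periodic(clean, 3))
    print(l2_distance_to_periodic(noisy, 3))
```

The exact computation above takes space linear in n; the abstract's contribution is approximating the same quantities from a small linear sketch, which this example does not attempt.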

Year: 2011

OAI identifier:
oai:CiteSeerX.psu:10.1.1.352.6098

Provided by:
CiteSeerX
