23,399 research outputs found

    Fast and Accurate Mining of Correlated Heavy Hitters

    Full text link
    The problem of mining Correlated Heavy Hitters (CHH) from a two-dimensional data stream has been introduced recently, and a deterministic algorithm based on the use of the Misra--Gries algorithm has been proposed by Lahiri et al. to solve it. In this paper we present a new counter-based algorithm for tracking CHHs, formally prove its error bounds and correctness and show, through extensive experimental results, that our algorithm outperforms the Misra--Gries based algorithm with regard to accuracy and speed whilst requiring asymptotically much less space

    Optimal Elephant Flow Detection

    Full text link
    Monitoring the traffic volumes of elephant flows, including the total byte count per flow, is a fundamental capability for online network measurements. We present an asymptotically optimal algorithm for solving this problem in terms of both space and time complexity. This improves on previous approaches, which can only count the number of packets in constant time. We evaluate our work on real packet traces, demonstrating an up to X2.5 speedup compared to the best alternative.Comment: Accepted to IEEE INFOCOM 201

    Simple and Deterministic Matrix Sketching

    Full text link
    We adapt a well known streaming algorithm for approximating item frequencies to the matrix sketching setting. The algorithm receives the rows of a large matrix A∈Rn×mA \in \R^{n \times m} one after the other in a streaming fashion. It maintains a sketch matrix B \in \R^ {1/\eps \times m} such that for any unit vector xx [\|Ax\|^2 \ge \|Bx\|^2 \ge \|Ax\|^2 - \eps \|A\|_{f}^2 \.] Sketch updates per row in AA require O(m/\eps^2) operations in the worst case. A slight modification of the algorithm allows for an amortized update time of O(m/\eps) operations per row. The presented algorithm stands out in that it is: deterministic, simple to implement, and elementary to prove. It also experimentally produces more accurate sketches than widely used approaches while still being computationally competitive
    • …
    corecore