Search CORE

35 research outputs found

Visualisation of cluster dynamics and change detection in ubiquitous data stream mining

Author: Gaber M.
Gillick B.
Krishnaswamy S.
Zaslavsky A.
Publication venue
Publication date: 29/06/2006
Field of study

Portsmouth University Research Portal (Pure)

A fuzzy approach for interpretation and application of ubiquitous data stream clustering

Author: Gaber M.
Horovitz O.
Krishnaswamy S.
Publication venue
Publication date: 07/10/2005
Field of study

Portsmouth University Research Portal (Pure)

Efficient Summing over Sliding Windows

Author: Basat Ran Ben
Einziger Gil
Friedman Roy
Kassner Yaron
Publication venue
Publication date: 01/01/2016
Field of study

This paper considers the problem of maintaining statistic aggregates over the last W elements of a data stream. First, the problem of counting the number of 1's in the last W bits of a binary stream is considered. A lower bound of {\Omega}(1/{\epsilon} + log W) memory bits for W{\epsilon}-additive approximations is derived. This is followed by an algorithm whose memory consumption is O(1/{\epsilon} + log W) bits, indicating that the algorithm is optimal and that the bound is tight. Next, the more general problem of maintaining a sum of the last W integers, each in the range of {0,1,...,R}, is addressed. The paper shows that approximating the sum within an additive error of RW{\epsilon} can also be done using {\Theta}(1/{\epsilon} + log W) bits for {\epsilon}={\Omega}(1/W). For {\epsilon}=o(1/W), we present a succinct algorithm which uses B(1 + o(1)) bits, where B={\Theta}(Wlog(1/W{\epsilon})) is the derived lower bound. We show that all lower bounds generalize to randomized algorithms as well. All algorithms process new elements and answer queries in O(1) worst-case time.Comment: A shorter version appears in SWAT 201

arXiv.org e-Print Archive

Dagstuhl Research Online Publication Server

Adaptive Mining Techniques for Data Streams Using Algorithm Output Granularity Mohamed

Author: Arkady Zaslavsky
Mohamed Medhat Gaber
Shonali Krishnaswamy
Publication venue: Springer Verlag
Publication date: 01/01/2003
Field of study

Mining data streams is an emerging area of research given the potentially large number of business and scientific applications. A significant challenge in analyzing /mining data streams is the high data rate of the stream. In this paper, we propose a novel approach to cope with the high data rate of incoming data streams. We termed our approach "algorithm output granularity". It is a resource-aware approach that is adaptable to available memory, time constraints, and data stream rate. The approach is generic and applicable to clustering, classification and counting frequent items mining techniques. We have developed a data stream clustering algorithm based on the algorithm output granularity approach. We present this algorithm and discuss its implementation and empirical evaluation. The experiments show acceptable accuracy accompanied with run-time efficiency. They show that the proposed algorithm outperforms the K-means in terms of running time while preserving the accuracy that our algorithm can achieve

CiteSeerX

Digitala Vetenskapliga Arkivet - Academic Archive On-line

Luleå University of Technology Publications

Clustering Parallel Data Streams

Author: Yixin Chen
Publication venue: 'IntechOpen'
Publication date: 01/01/2009
Field of study

IntechOpen

CiteSeerX