Mining Frequent Item sets in Data Streams

Author: Dass Rajanish

Fast and Accurate Mining of Correlated Heavy Hitters

Authors: Cafaro Massimo; Epicoco Italo; Pulimeno Marco
Publication date: 06/04/2017

The problem of mining Correlated Heavy Hitters (CHH) from a two-dimensional data stream has been introduced recently, and a deterministic algorithm based on the use of the Misra-Gries algorithm has been proposed by Lahiri et al. to solve it. In this paper we present a new counter-based algorithm for tracking CHHs, formally prove its error bounds and correctness and show, through extensive experimental results, that our algorithm outperforms the Misra-Gries based algorithm with regard to accuracy and speed whilst requiring asymptotically much less space.
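For orientation, the Misra-Gries summary mentioned in the abstract above can be sketched for a plain one-dimensional stream. This is only the classic building block, not the CHH algorithm of Lahiri et al. nor the counter-based algorithm proposed in the paper; the function name `misra_gries` and the parameter `k` are illustrative assumptions.

```python
from typing import Dict, Hashable, Iterable


def misra_gries(stream: Iterable[Hashable], k: int) -> Dict[Hashable, int]:
    """Classic Misra-Gries summary using at most k - 1 counters.

    Every item occurring more than n/k times in a stream of length n is
    guaranteed to survive in the summary, and each stored count
    underestimates the true frequency by at most n/k.
    (Illustrative sketch; names are assumptions, not from the paper.)
    """
    if k < 2:
        raise ValueError("k must be at least 2")
    counters: Dict[Hashable, int] = {}
    for item in stream:
        if item in counters:
            # Item is already monitored: increment its counter.
            counters[item] += 1
        elif len(counters) < k - 1:
            # A free counter is available: start monitoring the item.
            counters[item] = 1
        else:
            # No free counter: decrement all counters, dropping zeros.
            for key in list(counters):
                counters[key] -= 1
                if counters[key] == 0:
                    del counters[key]
    return counters


if __name__ == "__main__":
    stream = ["a"] * 6 + ["b"] * 3 + ["c", "d"]
    print(misra_gries(stream, k=3))
```

The stored counts are lower bounds, so a second pass (or an error-bound argument such as the one the paper proves for its own algorithm) is needed to separate true heavy hitters from false positives.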

arXiv.org e-Print Archive

Archivio Istituzionale della Ricerca - Università del Salento

State-of-the-art in data stream mining

Authors: Gaber M.; Gama J.
Publication date: 17/09/2007

Portsmouth University Research Portal (Pure)

Mining Recent Frequent Itemsets in Sliding Windows over Data Streams

Authors: Han Congying; He Guoping; Xu Lijun
Publication venue: Institute of Informatics, Slovak Academy of Sciences
Publication date: 27/01/2012

This paper considers the problem of mining recent frequent itemsets over data streams. As the data grows without limit at a rapid rate, it is hard to track changes in the frequent itemsets over data streams. We propose an efficient one-pass algorithm in sliding windows over data streams with an error bound guarantee. This algorithm does not need to refer to obsolete transactions when they are removed from the sliding window. It exploits a compact data structure to maintain potentially frequent itemsets so that it can output recent frequent itemsets at any time. Flexible queries for continuous transactions in the sliding window can be answered with an error bound guarantee.
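To make the sliding-window setting concrete, here is a naive exact baseline. It is not the paper's algorithm: unlike the approach described above, it stores every transaction in the window and revisits each one on expiry, and it only counts itemsets up to a fixed length. The class name `SlidingWindowItemsets` and the parameters `window_size`, `max_len` and `min_support` are assumptions made for illustration.

```python
from collections import Counter, deque
from itertools import combinations
from typing import Dict, Iterable, Iterator, Tuple

Itemset = Tuple[str, ...]


class SlidingWindowItemsets:
    """Exact itemset counts over the last `window_size` transactions.

    Naive baseline (illustrative assumption): keeps the whole window in
    memory and enumerates all itemsets of size <= max_len.
    """

    def __init__(self, window_size: int, max_len: int = 2) -> None:
        self.window_size = window_size
        self.max_len = max_len
        self.window: deque = deque()
        self.counts: Counter = Counter()

    def _subsets(self, transaction: Iterable[str]) -> Iterator[Itemset]:
        # Canonical (sorted) tuples so each itemset has a single key.
        items = sorted(set(transaction))
        for size in range(1, self.max_len + 1):
            yield from combinations(items, size)

    def add(self, transaction: Iterable[str]) -> None:
        transaction = tuple(transaction)
        self.window.append(transaction)
        self.counts.update(self._subsets(transaction))
        if len(self.window) > self.window_size:
            # Expire the oldest transaction and undo its contribution.
            expired = self.window.popleft()
            for itemset in self._subsets(expired):
                self.counts[itemset] -= 1
                if self.counts[itemset] == 0:
                    del self.counts[itemset]

    def frequent(self, min_support: float) -> Dict[Itemset, int]:
        # min_support is a fraction of the current window length.
        threshold = min_support * len(self.window)
        return {s: c for s, c in self.counts.items() if c >= threshold}


if __name__ == "__main__":
    miner = SlidingWindowItemsets(window_size=3)
    for t in (["a", "b"], ["a", "c"], ["a", "b"], ["b", "c"]):
        miner.add(t)
    print(miner.frequent(min_support=0.6))
```

The memory and per-expiry cost of this baseline is what approximate one-pass methods with an error bound are designed to avoid.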

Computing and Informatics (E-Journal - Institute of Informatics, SAS, Bratislava)

Catch the moment: maintaining closed frequent itemsets over a data stream sliding window

Publication venue: Springer
Publication date: 01/10/2006

Springer - Publisher Connector

Max-FISM: Mining (recently) maximal frequent itemsets over data streams using the sliding window model

Authors: Cercone Nick; Farzanyar Zahra; Kangavari Mohammadreza
Publication venue: Elsevier BV
Publication date: 30/09/2012

Frequent itemset mining from data streams is an important data mining problem with broad applications such as retail market data analysis, network monitoring, web usage mining, and stock market prediction. However, it is also a difficult problem due to the unbounded, high-speed and continuous characteristics of streaming data. Therefore, extracting frequent itemsets from more recent data can enhance the analysis of stream data. In this paper, we propose an efficient algorithm, called Max-FISM (Maximal-Frequent Itemsets Mining), for mining recent maximal frequent itemsets from a high-speed stream of transactions within a sliding window. According to our algorithm, whenever a new transaction is inserted in the current window, only its maximum itemset should be inserted into a prefix tree-based summary data structure called Max-Set, which maintains the number of independent appearances of each transaction in the current window. Finally, the set of recent maximal frequent itemsets is obtained from the current Max-Set. Experimental studies show that the proposed Max-FISM algorithm is highly efficient in terms of memory and time complexity for mining recent maximal frequent itemsets over high-speed data streams.
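As a reminder of what "maximal" means here: a frequent itemset is maximal when none of its proper supersets is frequent. The sketch below only applies that definition to an already computed collection of frequent itemsets; it does not reproduce the Max-Set prefix tree or any other part of Max-FISM, and the function name `maximal_itemsets` is an assumption.

```python
from typing import FrozenSet, Iterable, Set


def maximal_itemsets(frequent: Iterable[Iterable[str]]) -> Set[FrozenSet[str]]:
    """Return the itemsets that have no proper superset in `frequent`.

    Quadratic in the number of itemsets; intended only to illustrate
    the definition, not as an efficient stream algorithm.
    """
    sets = [frozenset(s) for s in frequent]
    # `s < t` on frozensets tests whether s is a proper subset of t.
    return {s for s in sets if not any(s < t for t in sets)}


if __name__ == "__main__":
    frequent = [{"a"}, {"b"}, {"a", "b"}, {"c"}]
    print(maximal_itemsets(frequent))
```

Reporting only maximal itemsets is a compact summary: every frequent itemset is a subset of some maximal one, although the individual support counts of the subsets are not retained.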

Elsevier - Publisher Connector

Data Stream Mining: A Review on Windowing Approach

Author: Mr. Pramod S.
Publication venue: Global Journals Inc. (US)
Publication date: 07/06/2012

In the data stream model, data arrive at high speed, so the algorithms used for mining data streams must process them under very strict constraints of space and time. This raises new issues that need to be considered when developing association rule mining algorithms for data streams. It is therefore important to study the existing stream mining algorithms in order to identify the challenges and the research scope for new researchers. In this paper we discuss different types of windowing techniques and the important algorithms available in this mining process.

Global Journal of Computer Science and Technology (GJCST)