Mining frequent patterns from dynamic data streams with data load management
This paper studies the practical problem of frequent-itemset discovery in data-stream environments that may suffer from data overload. The two main issues are frequent-pattern mining and data-overload handling, so we propose a mining algorithm together with two dedicated overload-handling mechanisms. The algorithm extracts basic information from the streaming data and keeps it in its data structure. A mining task is carried out on request by calculating the approximate counts of itemsets and returning the frequent ones. When data overload occurs, one of the two mechanisms is executed to resolve it, either by improving system throughput or by shedding data load. Experimental results show that the mining algorithm is efficient and has good accuracy. More importantly, it can effectively manage data overload with the overload-handling mechanisms. These results may lead to a feasible solution for frequent-pattern mining in dynamic data streams. © 2012 Elsevier Inc. All rights reserved.
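The abstract does not specify the algorithm's data structure or its two overload mechanisms, so the following is only an illustrative sketch of the general idea, not the paper's method. It swaps in the well-known Lossy Counting scheme for the approximate itemset counts and a simple "keep one transaction in every n" rule as a stand-in for load shedding. The class name `StreamMiner` and the parameters `epsilon`, `max_size`, and `keep_every` are all hypothetical.

```python
from itertools import combinations
from math import ceil


class StreamMiner:
    """Illustrative sketch only: Lossy Counting over itemsets plus naive shedding.

    This is NOT the algorithm from the paper; every name here is an assumption.
    """

    def __init__(self, epsilon=0.25, max_size=2, keep_every=1):
        self.epsilon = epsilon                 # maximum relative count error
        self.max_size = max_size               # largest itemset size tracked
        self.keep_every = keep_every           # 1 = no shedding, n = keep 1 in n
        self.bucket_width = ceil(1 / epsilon)  # transactions per bucket
        self.counts = {}                       # itemset -> (count, max undercount)
        self.seen = 0                          # transactions that arrived
        self.processed = 0                     # transactions actually counted

    def add(self, transaction):
        """Consume one transaction (an iterable of hashable, sortable items)."""
        self.seen += 1
        if (self.seen - 1) % self.keep_every != 0:
            return                             # load shedding: drop this one
        self.processed += 1
        bucket = ceil(self.processed / self.bucket_width)
        items = sorted(set(transaction))
        for size in range(1, self.max_size + 1):
            for itemset in combinations(items, size):
                # A new entry may have been missed in up to (bucket - 1) buckets.
                count, delta = self.counts.get(itemset, (0, bucket - 1))
                self.counts[itemset] = (count + 1, delta)
        if self.processed % self.bucket_width == 0:
            # Bucket boundary: prune entries that cannot be frequent.
            self.counts = {
                k: (c, d) for k, (c, d) in self.counts.items() if c + d > bucket
            }

    def frequent(self, min_support):
        """Return itemsets whose approximate support may reach min_support."""
        threshold = (min_support - self.epsilon) * self.processed
        return {k: c for k, (c, _) in self.counts.items() if c >= threshold}
```

In this sketch, `frequent(min_support)` answers a mining request on demand from the stored summary, and raising `keep_every` while the stream is overloaded trades accuracy for throughput. Support is measured against the transactions actually processed, so shed transactions do not contribute to any count.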