238,250 research outputs found
Efficient Analysis of Pattern and Association Rule Mining Approaches
The process of data mining produces various patterns from a given data
source. The most recognized data mining tasks are the process of discovering
frequent itemsets, frequent sequential patterns, frequent sequential rules and
frequent association rules. Numerous efficient algorithms have been proposed to
do the above processes. Frequent pattern mining has been a focused topic in
data mining research with a good number of references in literature and for
that reason an important progress has been made, varying from performant
algorithms for frequent itemset mining in transaction databases to complex
algorithms, such as sequential pattern mining, structured pattern mining,
correlation mining. Association Rule mining (ARM) is one of the utmost current
data mining techniques designed to group objects together from large databases
aiming to extract the interesting correlation and relation among huge amount of
data. In this article, we provide a brief review and analysis of the current
status of frequent pattern mining and discuss some promising research
directions. Additionally, this paper includes a comparative study between the
performance of the described approaches.Comment: 14 pages, 3 figures. arXiv admin note: text overlap with
arXiv:1312.4800; and with arXiv:1109.2427 by other author
Reductions for Frequency-Based Data Mining Problems
Studying the computational complexity of problems is one of the - if not the
- fundamental questions in computer science. Yet, surprisingly little is known
about the computational complexity of many central problems in data mining. In
this paper we study frequency-based problems and propose a new type of
reduction that allows us to compare the complexities of the maximal frequent
pattern mining problems in different domains (e.g. graphs or sequences). Our
results extend those of Kimelfeld and Kolaitis [ACM TODS, 2014] to a broader
range of data mining problems. Our results show that, by allowing constraints
in the pattern space, the complexities of many maximal frequent pattern mining
problems collapse. These problems include maximal frequent subgraphs in
labelled graphs, maximal frequent itemsets, and maximal frequent subsequences
with no repetitions. In addition to theoretical interest, our results might
yield more efficient algorithms for the studied problems.Comment: This is an extended version of a paper of the same title to appear in
the Proceedings of the 17th IEEE International Conference on Data Mining
(ICDM'17
Algorithms for Extracting Frequent Episodes in the Process of Temporal Data Mining
An important aspect in the data mining process is the discovery of patterns having a great influence on the studied problem. The purpose of this paper is to study the frequent episodes data mining through the use of parallel pattern discovery algorithms. Parallel pattern discovery algorithms offer better performance and scalability, so they are of a great interest for the data mining research community. In the following, there will be highlighted some parallel and distributed frequent pattern mining algorithms on various platforms and it will also be presented a comparative study of their main features. The study takes into account the new possibilities that arise along with the emerging novel Compute Unified Device Architecture from the latest generation of graphics processing units. Based on their high performance, low cost and the increasing number of features offered, GPU processors are viable solutions for an optimal implementation of frequent pattern mining algorithmsFrequent Pattern Mining, Parallel Computing, Dynamic Load Balancing, Temporal Data Mining, CUDA, GPU, Fermi, Thread
Reframing in Frequent Pattern Mining
Mining frequent patterns is a crucial task in data mining. Most of the existing frequent pattern mining methods find the complete set of frequent patterns from a given dataset. However, in real-life scenarios we often need to predict the future frequent patterns for different tasks such as business policy making, web page recommendation, stock-market behavior and road traffic analysis. Predicting future frequent patterns from the currently available set of frequent patterns is challenging due to dataset shift where data distributions may change from one dataset to another. In this paper, we propose a new approach called reframing in frequent pattern mining to solve this task. Moreover, we experimentally show the existence of dataset shift in two real-life transactional datasets and the capability of our approach to handle these unknown shifts
- …
