Direct mining of subjectively interesting relational patterns

Authors: Aknin Achille, De Bie Tijl, Guns Tias, Lijffijt Jefrey
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2016

Data is typically complex and relational. Therefore, the development of relational data mining methods is an increasingly active topic of research. Recent work has resulted in new formalisations of patterns in relational data and in a way to quantify their interestingness subjectively, taking into account the data analyst's prior beliefs about the data. Yet a scalable algorithm to find the most interesting such patterns is lacking. We introduce a new algorithm based on two notions: (1) the use of Constraint Programming, which results in a notably shorter development time, faster runtimes, and more flexibility for extensions such as branch-and-bound search, and (2) the direct search for the most interesting patterns only, instead of exhaustive enumeration of patterns before ranking them. Through empirical evaluation, we find that our novel bounds yield speedups of up to several orders of magnitude, especially on dense data with a simple schema. This makes it possible to mine the most subjectively interesting relational patterns present in databases where this was previously impractical or impossible.

Crossref

Ghent University Academic Bibliography

Discover, recycle and reuse frequent patterns in association rule mining

Author: CONG GAO
Publication date: 17/08/2004

Ph.D. (Doctor of Philosophy)

ScholarBank@NUS

Optimization of Association Rules Extraction Through Exploitation of Context Dependent Constraints

Authors: Botta M., Esposito R., Gallo A., Meo R.
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2005

Institutional Research Information System University of Turin

Data Mining in Databases: Languages and Indices

Authors: Elena Baralis, Meo Rosa, Silvia Chiusano, Tania Cerquitelli
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2018

Institutional Research Information System University of Turin

RDD-Eclat: Approaches to Parallelize Eclat Algorithm on Spark RDD Framework

Authors: F Zhang, J Dean, J Han, KK Sethi, KW Chon, MJ Zaki, S Rathee, S Singh, Y Xun
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 13/12/2019

Initially, a number of frequent itemset mining (FIM) algorithms were designed on Hadoop MapReduce, a distributed big data processing framework. However, due to heavy disk I/O, MapReduce has been found to be inefficient for such highly iterative algorithms. Spark, a more efficient distributed data processing framework, was therefore developed with in-memory computation and resilient distributed dataset (RDD) features to support iterative algorithms. Apriori- and FP-Growth-based FIM algorithms have been designed on the Spark RDD framework, but an Eclat-based algorithm has not yet been explored. In this paper, RDD-Eclat, a parallel Eclat algorithm on the Spark RDD framework, is proposed along with five variants. The proposed algorithms are evaluated on various benchmark datasets, and the results show that RDD-Eclat outperforms Spark-based Apriori by many times. The experimental results also show that the proposed algorithms scale as the number of cores and the size of the dataset increase.
Comment: 16 pages, 6 figures, ICCNCT 201
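
For readers unfamiliar with Eclat, the following is a minimal single-machine sketch of the general technique (vertical transaction-id lists combined by set intersection in a depth-first search). It is not the RDD-Eclat algorithm or any of its variants from the paper, and it does not use Spark; the function name `eclat` and the `min_support` parameter are illustrative assumptions.

```python
from collections import defaultdict


def eclat(transactions, min_support):
    """Return {frozenset(itemset): support} for every frequent itemset.

    Illustrative sketch only: a sequential, in-memory Eclat,
    not the Spark RDD variants described in the abstract.
    """
    # Build the vertical layout: item -> set of ids of transactions containing it.
    tidlists = defaultdict(set)
    for tid, items in enumerate(transactions):
        for item in items:
            tidlists[item].add(tid)

    frequent = {}

    def extend(prefix, candidates):
        # Each candidate is (item, tidset), where tidset already holds the
        # transactions containing prefix + item and meets min_support.
        for i, (item, tids) in enumerate(candidates):
            itemset = prefix | {item}
            frequent[itemset] = len(tids)
            # Intersect with the tid-lists of later items to grow the itemset.
            suffix = []
            for other, other_tids in candidates[i + 1:]:
                shared = tids & other_tids
                if len(shared) >= min_support:
                    suffix.append((other, shared))
            if suffix:
                extend(itemset, suffix)

    # Start from the frequent single items, in a fixed order.
    initial = sorted(
        ((item, tids) for item, tids in tidlists.items()
         if len(tids) >= min_support),
        key=lambda pair: pair[0],
    )
    extend(frozenset(), initial)
    return frequent
```

Support here is an absolute transaction count. Because each candidate carries its own tid-set, the support of an extended itemset is obtained from one set intersection rather than a new pass over the data, which is the property that distinguishes Eclat from Apriori-style counting.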

arXiv.org e-Print Archive

Crossref