635 research outputs found

Comparison of different algorithms for exploting the hidden trends in data sources

Author: Özsevim Emrah
Publication venue: Izmir Institute of Technology
Publication date: 01/01/2003

Thesis (Master)--Izmir Institute of Technology, Computer Engineering, Izmir, 2003. Includes bibliographical references (leaves: 92-97). Text in English; Abstract: Turkish and English. 97 leaves.
The growth of large-scale transactional databases, time-series databases and other kinds of databases has given rise to several efficient algorithms that cope with the computationally expensive task of association rule mining. In this study, three algorithms, Apriori, FP-tree and CHARM, which exploit hidden trends in the form of frequent itemsets, frequent patterns and closed frequent itemsets respectively, are discussed and their performance is evaluated. The performance of the algorithms was measured at different support levels, and the algorithms were tested on both synthetic and real data sets. The algorithms were compared according to their data preparation performance, mining performance, run-time performance and knowledge extraction capabilities. The Apriori algorithm is the most prevalent algorithm of association rule mining; it makes multiple passes over the database in order to find the set of frequent itemsets at each level. The FP-tree algorithm is a scalable algorithm that finds the complete set of prefix paths, conditional pattern bases and frequent patterns by using a compact FP-tree based mining method. CHARM is a novel algorithm that brings remarkable improvements over existing association rule mining algorithms by showing that mining the set of closed frequent itemsets is sufficient, instead of mining the set of all frequent itemsets. Based on our experimental results, we conclude that the Apriori algorithm performs well on sparse data sets. The FP-tree algorithm extracts fewer associations than Apriori; however, it is a completely feasible solution for mining dense data sets at low support levels. On the other hand, the CHARM algorithm is an appropriate algorithm for mining closed frequent itemsets (a substantial portion of frequent itemsets) on both sparse and dense data sets, even at low levels of support.
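
To make the level-wise idea in this abstract concrete, the following is a minimal Python sketch of Apriori-style mining, followed by a filter that keeps only closed frequent itemsets (those with no proper superset of equal support). It is an illustration written for this listing, not the thesis's implementation, and it is not CHARM, which mines closed itemsets directly rather than filtering. The function names `apriori` and `closed_itemsets` and the absolute `min_support` count are assumptions made for the example.

```python
from itertools import combinations

def apriori(transactions, min_support):
    """Return {frozenset(itemset): support_count} for all frequent itemsets."""
    transactions = [frozenset(t) for t in transactions]
    # Level 1: count single items in one pass over the database.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    current = {s: c for s, c in counts.items() if c >= min_support}
    frequent = dict(current)
    k = 2
    while current:
        # Join step: unions of frequent (k-1)-itemsets that have exactly k items.
        candidates = {a | b for a in current for b in current if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        candidates = {c for c in candidates
                      if all(frozenset(s) in current for s in combinations(c, k - 1))}
        # One further pass over the database to count the surviving candidates.
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        current = {s: c for s, c in counts.items() if c >= min_support}
        frequent.update(current)
        k += 1
    return frequent

def closed_itemsets(frequent):
    """Keep itemsets that have no proper superset with the same support."""
    return {s: c for s, c in frequent.items()
            if not any(s < t and c == frequent[t] for t in frequent)}

# Hypothetical toy database: four transactions, minimum support count of 2.
toy = [{"a", "b"}, {"a", "b"}, {"a", "b", "c"}, {"c"}]
frequent = apriori(toy, min_support=2)
closed = closed_itemsets(frequent)
```

On this toy data, `{a}` and `{b}` are frequent but not closed, because `{a, b}` occurs in exactly the same three transactions; the closed set is therefore smaller than the full frequent set while carrying the same support information.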

HybridMiner: Mining Maximal Frequent Itemsets Using Hybrid Database Representation Approach

Author: Baig Abdul Rauf
Bashir Shariq
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 21/04/2009

In this paper we present a novel hybrid (array-based layout and vertical bitmap layout) database representation approach for mining the complete set of Maximal Frequent Itemsets (MFI) on sparse and large datasets. Our work is novel in terms of scalability, item search order and two horizontal and vertical projection techniques. We also present a maximal algorithm using this hybrid database representation approach. Experimental results on real and sparse benchmark datasets show that our approach is better than previous state-of-the-art maximal algorithms. Comment: 8 pages. In the proceedings of 9th IEEE-INMIC 2005, Karachi, Pakistan, 200
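
As a rough illustration of the vertical bitmap layout mentioned in this abstract, the sketch below stores one Python integer per item, with bit i set when transaction i contains the item, so that the support of an itemset is the population count of the AND of its items' bitmaps. The depth-first search for maximal frequent itemsets is a generic textbook-style enumeration, not HybridMiner itself; the array-based layout and the projection techniques of the paper are not modelled. All function names are assumptions made for this example.

```python
def vertical_bitmaps(transactions):
    """Map each item to an int whose bit i is set if transaction i contains it."""
    bitmaps = {}
    for i, t in enumerate(transactions):
        for item in t:
            bitmaps[item] = bitmaps.get(item, 0) | (1 << i)
    return bitmaps

def support(itemset, bitmaps, n_transactions):
    """Support count = number of set bits in the AND of the items' bitmaps."""
    bits = (1 << n_transactions) - 1
    for item in itemset:
        bits &= bitmaps.get(item, 0)
    return bin(bits).count("1")

def maximal_itemsets(transactions, min_support):
    """Depth-first search keeping frequent itemsets with no frequent superset."""
    bitmaps = vertical_bitmaps(transactions)
    n = len(transactions)
    items = sorted(i for i in bitmaps if support([i], bitmaps, n) >= min_support)
    maximal = []

    def extend(prefix, bits, start):
        grew = False
        for j in range(start, len(items)):
            new_bits = bits & bitmaps[items[j]]
            if bin(new_bits).count("1") >= min_support:
                grew = True
                extend(prefix + [items[j]], new_bits, j + 1)
        # A non-extendable prefix is maximal unless an earlier result contains it.
        if not grew and prefix and not any(set(prefix) <= m for m in maximal):
            maximal.append(set(prefix))

    extend([], (1 << n) - 1, 0)
    return maximal
```

Because items are explored in a fixed order, any frequent superset of a non-extendable prefix has already been visited by the time the prefix is checked, which is why comparing against the results found so far is enough in this sketch.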

arXiv.org e-Print Archive

Crossref

A Survey on Discovering High Utility Itemset Mining from Transactional Database

Author: Madhushree B
Patel Shekhar
Publication venue: The International Institute for Science, Technology and Education (IISTE)
Publication date: 02/01/2016

Data mining is the process of evaluating data from different perspectives and summarizing it into useful information. It can be defined as the process of extracting information contained in very large databases. Traditional data mining methods have focused on finding correlations between items that appear frequently in the database, and the relative importance of each item is not considered in frequent pattern mining. High utility mining is an area of research in which utility-based mining can be done. Mining high utility itemsets from a transactional database refers to the discovery of itemsets with high utility in terms such as weight, unit profit or value. In this paper we present a literature survey of currently used algorithms for high utility itemset mining. Keywords: High utility, Transactional Database, HUI_Miner, FH
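
The notion of utility in this abstract can be illustrated with a small Python sketch. It assumes the commonly used definition in which the utility of an itemset is the sum, over the transactions containing every item of the itemset, of quantity times unit profit for those items. The function name `itemset_utility`, the data layout (one dict of item quantities per transaction) and the toy numbers are assumptions for the example, not taken from the survey.

```python
def itemset_utility(itemset, transactions, unit_profit):
    """Sum quantity * unit profit over transactions that contain the whole itemset."""
    items = set(itemset)
    total = 0
    for quantities in transactions:
        # Only transactions containing every item of the itemset contribute.
        if items.issubset(quantities):
            total += sum(quantities[i] * unit_profit[i] for i in items)
    return total

# Hypothetical toy data: each transaction maps item -> purchased quantity.
db = [
    {"bread": 2, "milk": 1},
    {"bread": 1, "caviar": 1},
    {"milk": 3},
]
profit = {"bread": 1, "milk": 2, "caviar": 50}

bread_utility = itemset_utility({"bread"}, db, profit)
bread_caviar_utility = itemset_utility({"bread", "caviar"}, db, profit)
```

Here `{bread}` appears in two transactions but has low utility, while `{bread, caviar}` appears only once yet has much higher utility, which is the gap between frequency and importance that the abstract points to.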

CiteSeerX

International Institute for Science, Technology and Education (IISTE): E-Journals

Privacy Preserving Utility Mining: A Survey

Author: Chao Han-Chieh
Gan Wensheng
Lin Jerry Chun-Wei
Wang Shyue-Liang
Yu Philip S.
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 18/11/2018

In the big data era, collected data usually contains rich information and hidden knowledge. Utility-oriented pattern mining and analytics have shown a powerful ability to explore these ubiquitous data, which may be collected from various fields and applications, such as market basket analysis, retail, click-stream analysis, medical analysis, and bioinformatics. However, analysis of these data with sensitive private information raises privacy concerns. To achieve a better trade-off between utility maximization and privacy preservation, Privacy-Preserving Utility Mining (PPUM) has become a critical issue in recent years. In this paper, we provide a comprehensive overview of PPUM. We first present the background of utility mining, privacy-preserving data mining and PPUM, then introduce the related preliminaries and problem formulation of PPUM, as well as some key evaluation criteria for PPUM. In particular, we present and discuss the current state-of-the-art PPUM algorithms, as well as their advantages and deficiencies in detail. Finally, we highlight and discuss some technical challenges and open directions for future research on PPUM. Comment: 2018 IEEE International Conference on Big Data, 10 page

arXiv.org e-Print Archive

Crossref

Evolving temporal association rules with genetic algorithms

Author: A. Ghandar
C.-Y. Chang
F. Herrera
J. Alcala-Fdez
J.H. Holland
K.A. Jong De
P.-N Tan
R. Agrawal
R. Agrawal
S. Laxman
X. Yan
Y. Li
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2010

A novel framework for mining temporal association rules by discovering itemsets with a genetic algorithm is introduced. Metaheuristics have been applied to association rule mining; we show the efficacy of extending this to another variant, temporal association rule mining. Our framework is an enhancement to existing temporal association rule mining methods, as it employs a genetic algorithm to search the rule space and the temporal space simultaneously. A methodology for validating the ability of the proposed framework isolates target temporal itemsets in synthetic datasets. The Iterative Rule Learning method successfully discovers these targets in datasets with varying levels of difficulty.

CiteSeerX

Crossref

Sheffield Hallam University Research Archive

Open Repository and Bibliography - Liège

Explore Bristol Research