Algorithms for multi-level frequent pattern mining

Author: Zheng Xi
Publication date: 01/01/2008

Data mining is a database paradigm used to extract useful information from large amounts of data. Among the functions that data mining provides, frequent pattern mining (FPM) has become one of the most popular research areas. Methods such as Apriori and FP-growth have been shown to discover useful association rules efficiently. However, these methods are usually restricted to a single concept level. Since typical business databases support concept hierarchies that represent the relationships among different concept levels, the focus has to be extended to discovering frequent patterns in multi-level environments. Unfortunately, this research area has received little attention. Simply applying single-level frequent pattern mining (SLFPM) methods several times in sequence does not necessarily work well for multi-level frequent pattern mining (MLFPM). In this thesis, we present two novel algorithms that efficiently discover multi-level frequent patterns. Adopting either a top-down or a bottom-up approach, our algorithms make extensive use of the existing FP-tree structure instead of scanning the raw dataset multiple times, as a naive implementation would. In addition, we introduce an algorithm to mine cross-level frequent patterns. Experimental results show that the new algorithms maintain their performance advantage across a broad spectrum of test environments.
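The abstract contrasts its FP-tree-based algorithms with a naive approach that re-mines the raw data once per concept level. The sketch below illustrates only that naive baseline, not the thesis's algorithms. The concept hierarchy, the transactions, the per-level support thresholds and all function names are hypothetical and chosen for illustration.

```python
from collections import Counter
from itertools import combinations

# Hypothetical concept hierarchy: child -> parent (None marks a top-level concept).
PARENT = {
    "skim milk": "milk",
    "whole milk": "milk",
    "wheat bread": "bread",
    "white bread": "bread",
    "milk": None,
    "bread": None,
}

def generalize(item, levels_up):
    """Replace an item by its ancestor `levels_up` steps higher, stopping at the top."""
    for _ in range(levels_up):
        parent = PARENT.get(item)
        if parent is None:
            break
        item = parent
    return item

def frequent_itemsets(transactions, min_support, max_size=2):
    """Naive support counting: enumerate every itemset up to max_size items."""
    counts = Counter()
    for t in transactions:
        items = sorted(set(t))
        for k in range(1, max_size + 1):
            counts.update(combinations(items, k))
    return {s: c for s, c in counts.items() if c >= min_support}

def mine_levels(transactions, min_support_by_level):
    """Mine each level separately; 0 means raw items, 1 means one step up the hierarchy."""
    result = {}
    for level, min_sup in min_support_by_level.items():
        # Each level triggers a full pass over the raw transactions.
        lifted = [[generalize(i, level) for i in t] for t in transactions]
        result[level] = frequent_itemsets(lifted, min_sup)
    return result

# Hypothetical transactions and absolute support thresholds per level.
transactions = [
    ["skim milk", "wheat bread"],
    ["whole milk", "white bread"],
    ["skim milk", "white bread"],
    ["whole milk"],
]
patterns = mine_levels(transactions, {0: 2, 1: 3})
```

In this toy data no pair of raw items reaches the threshold, yet the generalized pair ("bread", "milk") does, which is the kind of pattern that single-level mining misses. The repeated pass over the raw transactions for every level is the cost the abstract says its algorithms avoid.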

Concordia University Research Repository

Mining Interesting Positive and Negative Association Rule Based on Improved Genetic Algorithm (MIPNAR_GA)

Authors: Jain Anurag; Jain Susheel; Rai Nikky
Publication venue: The International Institute for Science, Technology and Education (IISTE)
Publication date: 30/12/2013

Association rule mining is a very efficient technique for finding strong relations between correlated data. The correlation of data makes the extraction process meaningful. A variety of algorithms, such as the Apriori algorithm and tree-based algorithms, are used to mine positive and negative rules. A number of these algorithms perform very well but produce a large number of negative association rules and also suffer from the multi-scan problem. The idea of this paper is to eliminate these problems and reduce the large number of negative rules. We therefore propose an improved approach to mining interesting positive and negative rules based on a genetic algorithm and the MLMS algorithm. In this method we use multi-level multiple support on a data table represented as 0 and 1. This divided process reduces the time spent scanning the database. The proposed algorithm, MIPNAR_GA, combines MLMS with a genetic algorithm and mines interesting positive and negative rules from frequent and infrequent pattern sets. The algorithm runs in three phases: (a) extract frequent and infrequent pattern sets using the Apriori method; (b) efficiently generate positive and negative rules; (c) prune redundant rules by applying interestingness measures. Rule optimization is performed by the genetic algorithm, and the algorithm is evaluated on real-world datasets such as heart disease data and some standard datasets from the UCI Machine Learning Repository.

Keywords: association rule mining, negative and positive rules, frequent and infrequent pattern sets, genetic algorithm
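To make the positive/negative rule distinction concrete, here is a minimal sketch of how a positive rule A -> B and a negative rule A -> not B can be scored from the same transaction counts. It is not the MIPNAR_GA algorithm: the genetic optimization and the MLMS step are omitted, and the use of lift as the interestingness measure, the toy transactions and all function names are assumptions made for illustration.

```python
from fractions import Fraction

def count(transactions, items):
    """Number of transactions that contain every item in `items`."""
    items = set(items)
    return sum(1 for t in transactions if items <= set(t))

def rule_measures(transactions, antecedent, consequent):
    """Confidence of A -> B, confidence of A -> not B, and lift of A -> B."""
    n = len(transactions)
    n_a = count(transactions, antecedent)
    n_b = count(transactions, consequent)
    n_ab = count(transactions, set(antecedent) | set(consequent))
    if n_a == 0 or n_b == 0:
        return None  # measures are undefined when either side never occurs
    return {
        "positive_confidence": Fraction(n_ab, n_a),
        # Transactions with A that do not contain all of B.
        "negative_confidence": Fraction(n_a - n_ab, n_a),
        # lift = sup(A and B) / (sup(A) * sup(B)), written with raw counts.
        "lift": Fraction(n_ab * n, n_a * n_b),
    }

def choose_rule(measures, min_conf):
    """Assumed pruning step: keep the rule direction that lift supports."""
    if measures["lift"] > 1 and measures["positive_confidence"] >= min_conf:
        return "A -> B"
    if measures["lift"] < 1 and measures["negative_confidence"] >= min_conf:
        return "A -> not B"
    return None

# Hypothetical transactions.
transactions = [{"a", "b"}, {"a", "b"}, {"a", "c"}, {"b", "c"}, {"a"}]
measures = rule_measures(transactions, {"a"}, {"b"})
kept = choose_rule(measures, Fraction(1, 2))
```

Here `a` and `b` co-occur less often than independence would predict (lift below 1), so the sketch keeps the negative rule rather than the positive one.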

International Institute for Science, Technology and Education (IISTE): E-Journals

An Improved Technique for Multi-Dimensional Constrained Gradient Mining

Authors: Elugbadebo O. J.; Folorunso O; Sodiya A. S.
Publication venue: Federal University of Agriculture, Abeokuta (FUNAAB)
Publication date: 26/02/2013

Multi-dimensional constrained gradient mining, an aspect of data mining, is based on mining constrained frequent gradient pattern pairs whose measures differ significantly in a transactional database. Previous studies used two algorithms for multi-dimensional constrained gradient mining: Top-k FP-growth with Gradient Pruning and Top-k FP-growth with No Gradient Pruning. However, these algorithms have shortcomings. The first requires an FP-tree to be constructed before the database is searched, and the second requires searching the database twice to find frequent pattern pairs. Both consume large amounts of time and memory, which makes mining the database cumbersome. To eliminate these drawbacks, a new algorithm is designed that combines Top-k FP-growth with Gradient Pruning and Top-k FP-growth with No Gradient Pruning. The new algorithm, called Top-k FP-growth with Support Gradient Pruning (SUPGRAP), scans the database once, searching for the node and all descendants of the node of every task at each level. The idea is to form a projected multi-dimensional database and then find the multi-dimensional patterns within the projected databases. Evaluation of the new algorithm shows a significant improvement in the time and space required compared with the existing algorithms.

Federal University of Agriculture, Abeokuta: FUNAAB Journal

An efficient parallel method for mining frequent closed sequential patterns

Authors: Huynh Bao; Snášel Václav; Vo Bay
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/01/2017

Mining frequent closed sequential patterns (FCSPs) has attracted a great deal of research attention because it is an important task in sequence mining. Recently, many studies have focused on mining frequent closed sequential patterns because such patterns have proved to be more efficient and compact than frequent sequential patterns. Information can be fully extracted from frequent closed sequential patterns. In this paper, we propose an efficient parallel approach called parallel dynamic bit vector frequent closed sequential patterns (pDBV-FCSP), which uses a multi-core processor architecture to mine FCSPs from large databases. pDBV-FCSP divides the search space to reduce the required storage space and performs closure checking of prefix sequences early to reduce the execution time of mining frequent closed sequential patterns. This approach overcomes the problems of parallel mining such as the overhead of communication, synchronization, and data replication. It also solves the load-balancing issues between processors with a dynamic mechanism that redistributes the work when some processes are out of work, minimizing idle CPU time.
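For readers unfamiliar with the term, a frequent sequential pattern is closed when no longer pattern containing it has the same support. The sketch below shows only that definition as a post-processing filter over already-mined patterns. It is not pDBV-FCSP: there are no bit vectors, no early prefix closure checks and no parallelism. Sequences are simplified to tuples of single items, and the pattern supports are hypothetical.

```python
def is_subsequence(small, big):
    """True if `small` appears in `big` in order, not necessarily contiguously."""
    remaining = iter(big)
    # `x in remaining` advances the iterator, so order is respected.
    return all(x in remaining for x in small)

def closed_patterns(support_by_pattern):
    """Keep patterns that no longer super-pattern matches in support."""
    closed = {}
    for pattern, sup in support_by_pattern.items():
        absorbed = any(
            len(other) > len(pattern)
            and other_sup == sup
            and is_subsequence(pattern, other)
            for other, other_sup in support_by_pattern.items()
        )
        if not absorbed:
            closed[pattern] = sup
    return closed

# Hypothetical frequent sequential patterns with their supports.
frequent = {
    ("a",): 3,
    ("b",): 2,
    ("a", "b"): 2,
}
closed = closed_patterns(frequent)
```

The pattern `("b",)` is dropped because `("a", "b")` contains it and has the same support, while `("a",)` stays because its support is higher than that of any longer pattern containing it. This quadratic filter is exactly the kind of late closure check that an algorithm performing closure checking early would aim to avoid.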

Crossref

DSpace at VSB Technical University of Ostrava

A review of associative classification mining

Author: Thabtah Fadi
Publication date: 01/01/2007

Associative classification mining is a promising approach in data mining that uses association rule discovery techniques to construct classification systems, also known as associative classifiers. In the last few years, a number of associative classification algorithms have been proposed, such as CPAR, CMAR, MCAR, MMAC and others. These algorithms employ several different rule discovery, rule ranking, rule pruning, rule prediction and rule evaluation methods. This paper focuses on surveying and comparing the state-of-the-art associative classification techniques with regard to the above criteria. Finally, future directions in associative classification, such as incremental learning and mining low-quality data sets, are also highlighted in this paper.
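The rule ranking and rule prediction steps mentioned in the abstract can be illustrated with a minimal sketch. It does not reproduce CPAR, CMAR, MCAR or MMAC. It assumes one common scheme: rank class association rules by confidence, then support, then shorter antecedent, and predict with the first rule whose antecedent matches. The rules, their support and confidence values, and the attribute names are hypothetical.

```python
from typing import NamedTuple

class Rule(NamedTuple):
    antecedent: frozenset  # attribute=value conditions that must all hold
    label: str             # class predicted when the rule fires
    support: float
    confidence: float

def rank(rules):
    """Assumed ordering: higher confidence, then higher support, then fewer conditions."""
    return sorted(rules, key=lambda r: (-r.confidence, -r.support, len(r.antecedent)))

def predict(ranked_rules, instance, default):
    """Return the label of the first ranked rule whose antecedent the instance satisfies."""
    for rule in ranked_rules:
        if rule.antecedent <= instance:
            return rule.label
    return default  # fall back to a default class when no rule matches

# Hypothetical class association rules.
rules = [
    Rule(frozenset({"outlook=sunny"}), "no", 0.30, 0.70),
    Rule(frozenset({"outlook=sunny", "humidity=normal"}), "yes", 0.15, 0.95),
    Rule(frozenset({"windy=false"}), "yes", 0.40, 0.70),
]
ranked = rank(rules)

first = predict(ranked, frozenset({"outlook=sunny", "humidity=normal", "windy=true"}), "yes")
second = predict(ranked, frozenset({"outlook=sunny", "humidity=high", "windy=true"}), "yes")
third = predict(ranked, frozenset({"outlook=rainy", "windy=true"}), "yes")
```

The first instance is classified by the high-confidence two-condition rule, the second falls through to the more general `outlook=sunny` rule, and the third matches no rule and receives the default class.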

CiteSeerX

University of Huddersfield Repository