1,352 research outputs found

Evaluation and optimization of frequent association rule based classification

Author: Izwan Nizal Mohd Shaharanee
Jastini Jamil
Publication venue: 'Penerbit Universiti Kebangsaan Malaysia (UKM Press)'
Publication date: 01/06/2014

Deriving useful and interesting rules from a data mining system is an essential task. Common problems include the discovery of random and coincidental patterns, patterns of no significant value, and the generation of a large volume of rules from a database. Work on sustaining the interestingness of rules generated by data mining algorithms is being actively examined and developed. This paper presents a systematic way to evaluate the association rules discovered by frequent itemset mining algorithms, combining common data mining and statistical interestingness measures, and outlines an appropriate sequence for their use. The experiments are performed using a number of real-world datasets that represent diverse characteristics of data/items, and a detailed evaluation of the rule sets is provided. Empirical results show that, with a proper combination of data mining and statistical analysis, the framework is capable of eliminating a large number of non-significant, redundant and contradictory rules while preserving relatively valuable rules with high accuracy and coverage when used in the classification problem. Moreover, the results reveal important characteristics of mining frequent itemsets and the impact of the confidence measure on the classification task.

UKM Journal Article Repository

Statistical strategies for pruning all the uninteresting association rules

Author: Casas Garriga Gemma
Publication date: 01/01/2003

We propose a general framework to formally describe the problem of capturing the intensity of implication for association rules through statistical metrics. In this framework we present properties that influence the interestingness of a rule, analyze the conditions that lead a measure to perform a perfect prune at a time, and define a final proper order to sort the surviving rules. We discuss why none of the currently employed measures can capture objective interestingness on its own, and why only a combination of some of them, applied in a multi-step fashion, can be reliable. In contrast, we propose a new simple modification of the Pearson coefficient that meets all the necessary requirements. We statistically infer a convenient cut-off threshold for this new metric by empirically describing its distribution function through simulation. Final experiments show the ability of our proposal.

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

UPCommons. Portal del coneixement obert de la UPC

A model for quality guaranteed resource-aware stream mining

Author: Franke C.
Gaber M.
Karnstedt M.
Publication date: 17/09/2007

Portsmouth University Research Portal (Pure)

Interactive visual exploration of association rules with rule-focusing methodology

Author: Blanchard Julien
Briand Henri
Guillet Fabrice
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2007

On account of the enormous number of rules that can be produced by data mining algorithms, knowledge post-processing is a difficult stage in an association rule discovery process. In order to find relevant knowledge for decision making, the user (a decision maker specialized in the data studied) needs to rummage through the rules. To assist him/her in this task, we here propose the rule-focusing methodology, an interactive methodology for the visual post-processing of association rules. It allows the user to explore large sets of rules freely by focusing his/her attention on limited subsets. This new approach relies on rule interestingness measures, on a visual representation, and on interactive navigation among the rules. We have implemented the rule-focusing methodology in a prototype system called ARVis. It exploits the user's focus to guide the generation of the rules by means of a specific constraint-based rule-mining algorithm.

Mining subjectively interesting patterns in rich data

Author: Deng Junning
Publication venue: Universiteit Gent. Faculteit Ingenieurswetenschappen en Architectuur
Publication date: 01/01/2021

Ghent University Academic Bibliography

MO-Miner: A Data Mining Tool Based on Multi-Objective Genetic Algorithms

Author: Gina M. B. de Oliveira
Luiz G. A. Martins
Maria C. S. Takiguti
Publication venue: 'IntechOpen'
Publication date: 01/10/2008

IntechOpen

Crossref

Mining Frequent Itemsets Using Genetic Algorithm

Author: Biswas Sushanta
Ghosh Soumadip
Sarkar Debasree
Sarkar Partha Pratim
Publication venue: 'Academy and Industry Research Collaboration Center (AIRCC)'
Publication date: 01/11/2010

In general, frequent itemsets are generated from large data sets by applying association rule mining algorithms such as Apriori, Partition, Pincer-Search, Incremental, and Border, which take too much computer time to compute all the frequent itemsets. Using a Genetic Algorithm (GA), we can improve the scenario. The major advantage of using a GA in the discovery of frequent itemsets is that it performs a global search and its time complexity is lower than that of other algorithms, as the genetic algorithm is based on the greedy approach. The main aim of this paper is to find all the frequent itemsets from given data sets using a genetic algorithm.

arXiv.org e-Print Archive

CiteSeerX

Crossref

Data mining using rule extraction from Kohonen self-organising maps

Author: Bowerman Chris
Malone James
McGarry Kenneth
Wermter Stefan
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 03/01/2006

The Kohonen self-organising feature map (SOM) has several important properties that can be used within the data mining/knowledge discovery and exploratory data analysis process. A key characteristic of the SOM is its topology-preserving ability to map a multi-dimensional input into a two-dimensional form. This feature is used for classification and clustering of data. However, a great deal of effort is still required to interpret the cluster boundaries. In this paper we present a technique that can be used to extract propositional IF...THEN rules from the SOM network's internal parameters. Such extracted rules can provide a human-understandable description of the discovered clusters.

Sunderland University Institutional Repository