12 research outputs found

    Learning to use a learned model: A two-stage approach to classification

    Association rule-based classifiers have recently emerged as competitive classification systems. However, there are still deficiencies that hinder their performance. One deficiency is the use of rules in the classification stage. Current systems assign classes to new objects based on the best rule applied or on some predefined scoring of multiple rules. In this paper we propose a new technique where the system automatically learns how to use the rules. We achieve this by developing a two-stage classification model. First, we use association rule mining to discover classification rules. Second, we employ another learning algorithm to learn how to use these rules in the prediction process. Our two-stage approach outperforms C4.5 and RIPPER on the UCI datasets in our study, and outperforms other rule-learning methods on more than half the datasets. The versatility of our method is also demonstrated by applying it to text classification, where it equals the performance of the best-known systems for this task, SVMs.

    Mining Positive and Negative Association Rules: An Approach for Confined Rules

    Typical association rules consider only items enumerated in transactions. Such rules are referred to as positive association rules. Negative association rules consider the same items, but in addition consider negated items (i.e. items absent from transactions). Negative association rules are useful in market-basket analysis to identify products that conflict with each other or products that complement each other. They are also very convenient for associative classifiers, that is, classifiers that build their classification model based on association rules. Many other applications would benefit from negative association rules were it not for the expensive process of discovering them. Indeed, mining for such rules necessitates the examination of an exponentially large search space. Despite their usefulness, and although they have been referred to in many publications, very few algorithms to mine them have been proposed to date. In this paper we propose an algorithm that extends the support-confidence framework with a sliding correlation coefficient threshold. In addition to finding confident positive rules that have a strong correlation, the algorithm discovers negative association rules with strong negative correlation between the antecedents and consequents.
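As a rough illustration of the idea in the abstract above, the sketch below scores a candidate rule by the correlation between its antecedent and consequent, computed from itemset supports. It is not the paper's algorithm: the phi coefficient is one common choice of correlation measure and is assumed here, the threshold is fixed rather than sliding, and the names `support`, `phi`, `rule_kind` and the toy transactions are all hypothetical.

```python
from math import sqrt


def support(transactions, items):
    """Fraction of transactions that contain every item in `items`."""
    items = set(items)
    return sum(items <= t for t in transactions) / len(transactions)


def phi(transactions, antecedent, consequent):
    """Phi correlation between 'antecedent present' and 'consequent present'."""
    s_x = support(transactions, antecedent)
    s_y = support(transactions, consequent)
    s_xy = support(transactions, set(antecedent) | set(consequent))
    denom = sqrt(s_x * (1 - s_x) * s_y * (1 - s_y))
    if denom == 0:
        # One side occurs in all or none of the transactions: no correlation signal.
        return 0.0
    return (s_xy - s_x * s_y) / denom


def rule_kind(transactions, antecedent, consequent, threshold=0.5):
    """Label a candidate rule using a fixed (assumed) correlation threshold."""
    rho = phi(transactions, antecedent, consequent)
    if rho >= threshold:
        return "positive"   # X -> Y
    if rho <= -threshold:
        return "negative"   # e.g. X -> not Y
    return None             # correlation too weak to report


# Toy data (made up for illustration): every basket has tea or coffee, never both.
transactions = [
    {"tea", "milk"},
    {"tea", "milk"},
    {"coffee"},
    {"coffee", "sugar"},
    {"tea"},
    {"coffee", "milk"},
]

tea_coffee = phi(transactions, {"tea"}, {"coffee"})
tea_coffee_kind = rule_kind(transactions, {"tea"}, {"coffee"})
tea_milk_kind = rule_kind(transactions, {"tea"}, {"milk"})
```

In this toy data, tea and coffee never co-occur, so the pair is flagged as a negative rule candidate, while tea and milk are only weakly correlated and are not reported at the assumed threshold of 0.5.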

    On pruning and tuning rules for associative classifiers

    The integration of supervised classification and association rules for building classification models is not new. One major advantage is that the models are human readable and can be edited. However, it is common knowledge that association rule mining typically yields a very large number of rules, defeating the purpose of a human-readable model. Pruning unnecessary rules without jeopardizing the classification accuracy is paramount but very challenging. In this paper we study strategies for classification rule pruning in the case of associative classifiers.

    1 Associative Classifiers and their massive model

    Association rules are typically known as an important and common means for market basket analysis. However, it has been observed that association rules could be used to model relationships between class labels and features from a training set [4]. Therefore, association rules were used to efficiently build a classification model from very large training datasets. Since then, many associative classifiers were proposed mainly differing in the strategies used to select rule

    Text Document Categorization by Term Association

    A good text classifier is a classifier that efficiently categorizes large sets of text documents in a reasonable time frame and with acceptable accuracy, and that provides human-readable classification rules for possible fine-tuning. If the training of the classifier is also quick, this can be a valuable asset in some application domains. Many techniques and algorithms for automatic text categorization have been devised. According to published literature, some are more accurate than others, and some provide more interpretable classification models than others. However, none can combine all the beneficial properties enumerated above. In this paper, we present a novel approach for automatic text categorization that borrows from market basket analysis techniques using association rule mining in the data-mining field. We focus on two major problems: (1) finding the best term association rules in a textual database by generating and pruning; and (2) using the rules to build a text classifier. Our text categorization method proves to be efficient and effective, and experiments on well-known collections show that the classifier performs well. In addition, both training and classification are fast, and the generated rules are human readable.