Categorization of interestingness measures for knowledge extraction
Finding interesting association rules is an important and active research
field in data mining. The algorithms of the Apriori family are based on two
rule extraction measures, support and confidence. Although these two measures
have the virtue of being algorithmically fast, they generate a prohibitive
number of rules most of which are redundant and irrelevant. It is therefore
necessary to use further measures that filter out uninteresting rules. Many
survey studies have since examined interestingness measures from
several points of view. Several of these studies sought to
identify "good" properties of rule extraction measures, and these properties
have been assessed on 61 measures. The purpose of this paper is twofold: first,
to extend the number of measures and properties studied and to formalize
the properties proposed in the literature; second, in
the light of this formal study, to categorize the studied measures. The paper
thus identifies categories of measures to help users
efficiently select one or more appropriate measures
during the knowledge extraction process. Evaluating the properties on the 61
measures enabled us to identify 7 classes of measures, which we
obtained using two different clustering techniques.
Comment: 34 pages, 4 figures
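As a rough illustration of the two Apriori measures named in the abstract, the sketch below computes support and confidence on a toy transaction list, plus lift as one example of a further measure. The data, the function names, and the choice of lift are assumptions made for illustration only; the abstract does not say which of the 61 measures it covers.

```python
from typing import FrozenSet, Iterable, List

# Hypothetical toy transaction database (not taken from the paper).
TRANSACTIONS: List[FrozenSet[str]] = [
    frozenset({"bread", "milk"}),
    frozenset({"bread", "butter"}),
    frozenset({"bread", "milk", "butter"}),
    frozenset({"milk"}),
]


def support(itemset: Iterable[str], transactions: List[FrozenSet[str]]) -> float:
    """Fraction of transactions that contain every item of `itemset`."""
    items = frozenset(itemset)
    return sum(items <= t for t in transactions) / len(transactions)


def confidence(antecedent: Iterable[str], consequent: Iterable[str],
               transactions: List[FrozenSet[str]]) -> float:
    """Estimate of P(consequent | antecedent); 0.0 if the antecedent never occurs."""
    a, c = frozenset(antecedent), frozenset(consequent)
    supp_a = support(a, transactions)
    if supp_a == 0:
        return 0.0
    return support(a | c, transactions) / supp_a


def lift(antecedent: Iterable[str], consequent: Iterable[str],
         transactions: List[FrozenSet[str]]) -> float:
    """Confidence divided by the consequent's support; values below 1
    suggest the two sides co-occur less often than under independence."""
    supp_c = support(consequent, transactions)
    if supp_c == 0:
        return 0.0
    return confidence(antecedent, consequent, transactions) / supp_c


if __name__ == "__main__":
    rule = ({"bread"}, {"milk"})
    print("support   :", support(rule[0] | rule[1], TRANSACTIONS))
    print("confidence:", confidence(*rule, TRANSACTIONS))
    print("lift      :", lift(*rule, TRANSACTIONS))
```

On this toy data the rule bread → milk has a confidence of about two thirds but a lift below 1, so a confidence threshold alone could accept a rule whose two sides are negatively associated.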
A robustness measure of association rules
We propose a formal definition of the robustness of association rules for interestingness measures. Robustness is a central concept in rule evaluation, yet it has not been studied satisfactorily so far. It is crucial because a good rule (according to a given quality measure) might turn out to be very fragile with respect to small variations in the data. The robustness measure that we propose here is based on a model we proposed in a previous work. It depends on the selected quality measure, the value taken by the rule and the minimal acceptance threshold chosen by the user. We present a few properties of this robustness, detail its use in practice and show the outcomes of various experiments. Furthermore, we compare our results to classical tools of statistical analysis of association rules. All in all, we present a new perspective on the evaluation of association rules.
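The abstract does not state the robustness formula, so the sketch below is not the authors' definition. It is only a crude proxy for the fragility idea, assuming confidence as the quality measure: it counts how many supporting transactions would have to become counterexamples before a rule falls below the user's threshold. The function name and the counts are hypothetical.

```python
def flips_to_reject(n_a: int, n_ab: int, min_conf: float) -> int:
    """Smallest number k of supporting transactions that must turn into
    counterexamples for confidence (n_ab - k) / n_a to fall below min_conf.

    n_a  : transactions containing the antecedent
    n_ab : transactions containing antecedent and consequent
    Returns 0 when the rule is already below the threshold.
    Illustrative proxy only, not the robustness measure of the paper.
    """
    if n_a <= 0 or not 0 <= n_ab <= n_a:
        raise ValueError("need n_a > 0 and 0 <= n_ab <= n_a")
    if not 0 < min_conf <= 1:
        raise ValueError("min_conf must lie in (0, 1]")
    k = 0
    # Terminates because confidence reaches 0 < min_conf at k == n_ab.
    while (n_ab - k) / n_a >= min_conf:
        k += 1
    return k


if __name__ == "__main__":
    # Two hypothetical rules that both pass a 0.75 confidence threshold.
    print(flips_to_reject(n_a=100, n_ab=77, min_conf=0.75))
    print(flips_to_reject(n_a=100, n_ab=95, min_conf=0.75))
```

Both hypothetical rules are accepted at the threshold, yet the first is rejected after far fewer changed transactions than the second, which is the kind of difference a robustness notion is meant to expose.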