SemGrAM - Integrating semantic graphs into association rule mining
To date, most association rule mining algorithms
have assumed that the domains of items are either
discrete or, in a limited number of cases, hierarchical,
categorical or linear. This constrains the search for
interesting rules to those that satisfy the specified
quality metrics as independent values or as higher
level concepts of those values. However, in many
cases the determination of a single hierarchy is not
practicable and, for many datasets, an item’s value
may be taken from a domain that is more conveniently
structured as a graph with weights indicating
semantic (or conceptual) distance. Research in the
development of algorithms that generate disjunctive
association rules has allowed the production of
rules such as Radios ∨ TVs -> Cables. In many
cases there is little semantic relationship between
the disjunctive terms, and arguably less readable
rules such as Radios ∨ Tuesday -> Cables can
result. This paper describes two association rule
mining algorithms, SemGrAMG and SemGrAMP,
that accommodate conceptual distance information
contained in a semantic graph. The SemGrAM
algorithms permit the discovery of rules that include
an association between sets of cognate groups of
item values. The paper discusses the algorithms, the
design decisions made during their development and
some experimental results.
Sydney, NS
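
As a rough illustration of the idea in the abstract above (item values drawn from a graph whose weights indicate semantic distance), the following sketch groups values that lie within a chosen distance of one another and measures how often such a group appears in transactions. It is not SemGrAMG or SemGrAMP. The graph, its weights, the threshold, and the function names `semantic_distances`, `cognate_group` and `group_support` are all assumptions made for this example.

```python
import heapq

# Hypothetical semantic graph: edge weights are assumed conceptual distances.
SEMANTIC_GRAPH = {
    "Radios": {"TVs": 1.0, "Cables": 2.0},
    "TVs": {"Radios": 1.0, "Cables": 2.0},
    "Cables": {"Radios": 2.0, "TVs": 2.0},
    "Tuesday": {"Wednesday": 1.0},
    "Wednesday": {"Tuesday": 1.0},
}


def semantic_distances(graph, source):
    """Shortest-path (Dijkstra) distance from source to every reachable value."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist.get(node, float("inf")):
            continue  # stale heap entry
        for neighbour, weight in graph.get(node, {}).items():
            candidate = d + weight
            if candidate < dist.get(neighbour, float("inf")):
                dist[neighbour] = candidate
                heapq.heappush(heap, (candidate, neighbour))
    return dist


def cognate_group(graph, source, max_distance):
    """Values whose semantic distance from source is at most max_distance."""
    dist = semantic_distances(graph, source)
    return {value for value, d in dist.items() if d <= max_distance}


def group_support(transactions, group):
    """Fraction of transactions containing at least one value from the group."""
    hits = sum(1 for t in transactions if group & set(t))
    return hits / len(transactions)


# Illustrative transactions (assumed data, not from the paper).
TRANSACTIONS = [["Radios", "Cables"], ["TVs"], ["Tuesday"], ["Cables"]]
```

With this toy graph, `cognate_group(SEMANTIC_GRAPH, "Radios", 1.0)` groups Radios with TVs, while Tuesday is never reachable from Radios at any threshold, so a disjunction such as Radios ∨ Tuesday would not be formed from a single group.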
Towards a semantic and statistical selection of association rules
The increasing growth of databases raises an urgent need for more accurate
methods to better understand the stored data. In this context, association rules
have been used extensively for the analysis and comprehension of huge amounts of
data. However, the number of generated rules is too large to be efficiently
analyzed and explored in any further process. Association rule selection is a
classical approach to this issue, yet new, innovative approaches are
required to help decision makers. Hence, many interestingness
measures have been defined to statistically evaluate and filter
association rules. However, these measures present two major problems. On the
one hand, they do not allow irrelevant rules to be eliminated; on the other hand,
their abundance leads to heterogeneous evaluation results, which
causes confusion in decision making. In this paper, we propose a two-winged
approach to select statistically interesting and semantically incomparable
rules. Our statistical selection helps discover interesting association
rules without favoring or excluding any measure. The semantic comparability
helps to decide whether the considered association rules are semantically related,
i.e., comparable. The outcomes of our experiments on real datasets show promising
results in terms of reduction in the number of rules.
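
To make the notion of interestingness measures concrete, here is a minimal sketch that computes three standard measures (support, confidence and lift) for a rule, then keeps only the rules that no other rule beats on every measure at once. The dominance filter is just one possible reading of "without favoring or excluding any measure" and is an assumption of this example, not a description of the paper's method. The transactions, rule names and function names are likewise hypothetical.

```python
def rule_measures(transactions, antecedent, consequent):
    """Return (support, confidence, lift) for the rule antecedent -> consequent."""
    n = len(transactions)
    n_ante = sum(1 for t in transactions if antecedent <= t)
    n_cons = sum(1 for t in transactions if consequent <= t)
    n_both = sum(1 for t in transactions if (antecedent | consequent) <= t)
    support = n_both / n
    confidence = n_both / n_ante if n_ante else 0.0
    lift = confidence / (n_cons / n) if n_cons else 0.0
    return (support, confidence, lift)


def dominates(m1, m2):
    """True if m1 is at least as good as m2 on every measure and better on one."""
    at_least = all(x >= y for x, y in zip(m1, m2))
    strictly = any(x > y for x, y in zip(m1, m2))
    return at_least and strictly


def undominated(scored):
    """Keep the rules whose measure vector is not dominated by any other rule."""
    return [
        rule
        for rule, m in scored.items()
        if not any(dominates(other, m) for r, other in scored.items() if r != rule)
    ]


# Illustrative data (assumed, not from the paper's experiments).
TRANSACTIONS = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]
RULES = {
    "bread -> milk": ({"bread"}, {"milk"}),
    "butter -> bread": ({"butter"}, {"bread"}),
    "milk -> butter": ({"milk"}, {"butter"}),
}
SCORED = {name: rule_measures(TRANSACTIONS, a, c) for name, (a, c) in RULES.items()}
SELECTED = undominated(SCORED)
```

Because every measure is compared component-wise, no single measure is given priority; a rule is discarded only when another rule is at least as good on all of them.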