19,501 research outputs found
Class Association Rules Mining based Rough Set Method
This paper investigates the mining of class association rules with rough set
approach. In data mining, an association occurs between two set of elements
when one element set happen together with another. A class association rule set
(CARs) is a subset of association rules with classes specified as their
consequences. We present an efficient algorithm for mining the finest class
rule set inspired form Apriori algorithm, where the support and confidence are
computed based on the elementary set of lower approximation included in the
property of rough set theory. Our proposed approach has been shown very
effective, where the rough set approach for class association discovery is much
simpler than the classic association method.Comment: 10 pages, 2 figure
Temporal fuzzy association rule mining with 2-tuple linguistic representation
This paper reports on an approach that contributes towards the problem of discovering fuzzy association rules that exhibit a temporal pattern. The novel application of the 2-tuple linguistic representation identifies fuzzy association rules in a temporal context, whilst maintaining the interpretability of linguistic terms. Iterative Rule Learning (IRL) with a Genetic Algorithm (GA) simultaneously induces rules and tunes the membership functions. The discovered rules were compared with those from a traditional method of discovering fuzzy association rules and results demonstrate how the traditional method can loose information because rules occur at the intersection of membership function boundaries. New information can be mined from the proposed approach by improving upon rules discovered with the traditional method and by discovering new rules
A Model-Based Frequency Constraint for Mining Associations from Transaction Data
Mining frequent itemsets is a popular method for finding associated items in
databases. For this method, support, the co-occurrence frequency of the items
which form an association, is used as the primary indicator of the
associations's significance. A single user-specified support threshold is used
to decided if associations should be further investigated. Support has some
known problems with rare items, favors shorter itemsets and sometimes produces
misleading associations.
In this paper we develop a novel model-based frequency constraint as an
alternative to a single, user-specified minimum support. The constraint
utilizes knowledge of the process generating transaction data by applying a
simple stochastic mixture model (the NB model) which allows for transaction
data's typically highly skewed item frequency distribution. A user-specified
precision threshold is used together with the model to find local frequency
thresholds for groups of itemsets. Based on the constraint we develop the
notion of NB-frequent itemsets and adapt a mining algorithm to find all
NB-frequent itemsets in a database. In experiments with publicly available
transaction databases we show that the new constraint provides improvements
over a single minimum support threshold and that the precision threshold is
more robust and easier to set and interpret by the user
A fuzzy approach for mining quantitative association rules
During the last ten years, data mining, also known as knowledge discovery in databases, has established its position as a prominent and important research area. Mining association rules is one of the important research problems in data mining. Many algorithms have been proposed to find association rules in databases with quantitative attributes. The algorithms usually discretize the attribute domains into sharp intervals, and then apply simpler algorithms developed for boolean attributes. An example of a quantitative association rule might be "10% of married people between age 50 and 70 have at least 2 cars". Recently, fuzzy sets were suggested to represent intervals with non-sharp boundaries. Using the fuzzy concept, the above example could be rephrased e.g. "10% of married old people have several cars". However, if the fuzzy sets are not well chosen, anomalies may occur. In this paper we tackle this problem by introducing an additional fuzzy normalization process. Then we present the definition of quantitative association rules based on fuzzy set theory and propose a new algorithm for mining fuzzy association rules. The algorithm uses generalized definitions for interest measures. Experimental results show the efficiency of the algorithm for large databases
Subgroup Discovery: Real-World Applications
Subgroup discovery is a data mining technique which extracts interesting rules with respect
to a target variable. An important characteristic of this task is the combination of predictive
and descriptive induction. In this paper, an overview about subgroup discovery is performed.
In addition, di erent real-world applications solved through evolutionary algorithms where the
suitability and potential of this type of algorithms for the development of subgroup discovery
algorithms are presented
Data mining in soft computing framework: a survey
The present article provides a survey of the available literature on data mining using soft computing. A categorization has been provided based on the different soft computing tools and their hybridizations used, the data mining function implemented, and the preference criterion selected by the model. The utility of the different soft computing methodologies is highlighted. Generally fuzzy sets are suitable for handling the issues related to understandability of patterns, incomplete/noisy data, mixed media information and human interaction, and can provide approximate solutions faster. Neural networks are nonparametric, robust, and exhibit good learning and generalization capabilities in data-rich environments. Genetic algorithms provide efficient search algorithms to select a model, from mixed media data, based on some preference criterion/objective function. Rough sets are suitable for handling different types of uncertainty in data. Some challenges to data mining and the application of soft computing methodologies are indicated. An extensive bibliography is also included
Unexpected rules using a conceptual distance based on fuzzy ontology
AbstractOne of the major drawbacks of data mining methods is that they generate a notably large number of rules that are often obvious or useless or, occasionally, out of the user’s interest. To address such drawbacks, we propose in this paper an approach that detects a set of unexpected rules in a discovered association rule set. Generally speaking, the proposed approach investigates the discovered association rules using the user’s domain knowledge, which is represented by a fuzzy domain ontology. Next, we rank the discovered rules according to the conceptual distances of the rules
- …