
    Class Association Rules Mining based Rough Set Method

    This paper investigates the mining of class association rules with a rough set approach. In data mining, an association occurs between two sets of elements when one element set occurs together with another. A class association rule set (CARs) is a subset of association rules with classes specified as their consequents. We present an efficient algorithm for mining the finest class rule set, inspired by the Apriori algorithm, where the support and confidence are computed based on the elementary sets of the lower approximation from rough set theory. Our proposed approach has been shown to be very effective, and the rough set approach for class association discovery is much simpler than the classic association method. Comment: 10 pages, 2 figures
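
    The terms in this abstract can be illustrated with a minimal sketch. It is not the paper's algorithm: the toy decision table `ROWS`, the attribute names, and the helper functions `elementary_sets`, `lower_approximation`, and `rule_stats` are all assumptions made for illustration. It groups objects into elementary sets (objects indiscernible on the condition attributes), builds the lower approximation of a class, and computes support and confidence for a rule whose antecedent is one elementary set.

```python
from collections import defaultdict


def elementary_sets(rows, attrs):
    """Group row indices by their values on attrs (indiscernibility classes)."""
    groups = defaultdict(set)
    for i, row in enumerate(rows):
        groups[tuple(row[a] for a in attrs)].add(i)
    return dict(groups)


def lower_approximation(rows, attrs, label_key, label):
    """Union of the elementary sets that lie entirely inside the class."""
    target = {i for i, row in enumerate(rows) if row[label_key] == label}
    lower = set()
    for block in elementary_sets(rows, attrs).values():
        if block <= target:
            lower |= block
    return lower


def rule_stats(rows, attrs, values, label_key, label):
    """Support and confidence of the rule (attrs == values) -> label."""
    block = elementary_sets(rows, attrs).get(tuple(values), set())
    hits = {i for i in block if rows[i][label_key] == label}
    support = len(hits) / len(rows)
    confidence = len(hits) / len(block) if block else 0.0
    return support, confidence


# Hypothetical decision table: two condition attributes and a class "play".
ROWS = [
    {"outlook": "sunny", "windy": "no", "play": "yes"},
    {"outlook": "sunny", "windy": "no", "play": "yes"},
    {"outlook": "rainy", "windy": "yes", "play": "no"},
    {"outlook": "rainy", "windy": "no", "play": "yes"},
    {"outlook": "rainy", "windy": "no", "play": "no"},
]
ATTRS = ["outlook", "windy"]

LOWER_YES = lower_approximation(ROWS, ATTRS, "play", "yes")
CERTAIN = rule_stats(ROWS, ATTRS, ("sunny", "no"), "play", "yes")
UNCERTAIN = rule_stats(ROWS, ATTRS, ("rainy", "no"), "play", "yes")
```

    In this toy table, a rule built from an elementary set inside the lower approximation has confidence 1, while an elementary set that straddles two classes yields a rule with lower confidence.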

    Temporal fuzzy association rule mining with 2-tuple linguistic representation

    This paper reports on an approach that contributes towards the problem of discovering fuzzy association rules that exhibit a temporal pattern. The novel application of the 2-tuple linguistic representation identifies fuzzy association rules in a temporal context, whilst maintaining the interpretability of linguistic terms. Iterative Rule Learning (IRL) with a Genetic Algorithm (GA) simultaneously induces rules and tunes the membership functions. The discovered rules were compared with those from a traditional method of discovering fuzzy association rules, and the results demonstrate how the traditional method can lose information because rules occur at the intersection of membership function boundaries. New information can be mined with the proposed approach by improving upon rules discovered with the traditional method and by discovering new rules.

    A Model-Based Frequency Constraint for Mining Associations from Transaction Data

    Mining frequent itemsets is a popular method for finding associated items in databases. For this method, support, the co-occurrence frequency of the items which form an association, is used as the primary indicator of the association's significance. A single user-specified support threshold is used to decide if associations should be further investigated. Support has some known problems with rare items, favors shorter itemsets and sometimes produces misleading associations. In this paper we develop a novel model-based frequency constraint as an alternative to a single, user-specified minimum support. The constraint utilizes knowledge of the process generating transaction data by applying a simple stochastic mixture model (the NB model) which allows for transaction data's typically highly skewed item frequency distribution. A user-specified precision threshold is used together with the model to find local frequency thresholds for groups of itemsets. Based on the constraint we develop the notion of NB-frequent itemsets and adapt a mining algorithm to find all NB-frequent itemsets in a database. In experiments with publicly available transaction databases we show that the new constraint provides improvements over a single minimum support threshold and that the precision threshold is more robust and easier to set and interpret by the user.
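
    As a minimal sketch of the baseline this abstract argues against, the code below computes support and applies a single minimum support threshold. It does not implement the NB model or NB-frequent itemsets; the toy `TRANSACTIONS` data and the helper names `support` and `frequent_itemsets` are assumptions made for illustration.

```python
from itertools import combinations


def support(itemset, transactions):
    """Fraction of transactions that contain every item of itemset."""
    items = frozenset(itemset)
    return sum(1 for t in transactions if items <= t) / len(transactions)


def frequent_itemsets(transactions, min_support, max_size=2):
    """Brute-force search for itemsets meeting one global support threshold."""
    all_items = sorted(set().union(*transactions))
    result = {}
    for size in range(1, max_size + 1):
        for candidate in combinations(all_items, size):
            s = support(candidate, transactions)
            if s >= min_support:
                result[frozenset(candidate)] = s
    return result


# Hypothetical market-basket data with one rare item ("caviar").
TRANSACTIONS = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
    {"caviar", "bread"},
]

FREQUENT = frequent_itemsets(TRANSACTIONS, min_support=0.4)
```

    With the threshold at 0.4, every itemset containing the rare item is discarded, and a superset never has higher support than its subsets, which illustrates the rare-item problem and the bias towards shorter itemsets mentioned in the abstract.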

    A fuzzy approach for mining quantitative association rules

    During the last ten years, data mining, also known as knowledge discovery in databases, has established its position as a prominent and important research area. Mining association rules is one of the important research problems in data mining. Many algorithms have been proposed to find association rules in databases with quantitative attributes. The algorithms usually discretize the attribute domains into sharp intervals, and then apply simpler algorithms developed for boolean attributes. An example of a quantitative association rule might be "10% of married people between age 50 and 70 have at least 2 cars". Recently, fuzzy sets were suggested to represent intervals with non-sharp boundaries. Using the fuzzy concept, the above example could be rephrased as, e.g., "10% of married old people have several cars". However, if the fuzzy sets are not well chosen, anomalies may occur. In this paper we tackle this problem by introducing an additional fuzzy normalization process. Then we present the definition of quantitative association rules based on fuzzy set theory and propose a new algorithm for mining fuzzy association rules. The algorithm uses generalized definitions for interest measures. Experimental results show the efficiency of the algorithm for large databases.
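
    The difference between sharp intervals and fuzzy sets can be sketched as follows. This is one common way to define fuzzy support (mean of the minimum membership degree per record), not necessarily the definition or the normalization process used in the paper; the `PEOPLE` records, the membership functions `old` and `several`, and their breakpoints are assumptions made for illustration.

```python
def ramp(x, lo, hi):
    """Membership degree rising linearly from 0 at lo to 1 at hi."""
    if x <= lo:
        return 0.0
    if x >= hi:
        return 1.0
    return (x - lo) / (hi - lo)


def old(age):
    """Assumed fuzzy set 'old': starts at age 50, fully old from 70."""
    return ramp(age, 50, 70)


def several(cars):
    """Assumed fuzzy set 'several cars': starts above 1, full at 3."""
    return ramp(cars, 1, 3)


def fuzzy_support(records, memberships):
    """Mean over records of the minimum membership across all attributes."""
    total = 0.0
    for rec in records:
        total += min(mu(rec[attr]) for attr, mu in memberships.items())
    return total / len(records)


def crisp_support(records):
    """Sharp-interval counterpart: age in [50, 70] and at least 2 cars."""
    hits = sum(1 for r in records if 50 <= r["age"] <= 70 and r["cars"] >= 2)
    return hits / len(records)


# Hypothetical records.
PEOPLE = [
    {"age": 72, "cars": 3},
    {"age": 55, "cars": 2},
    {"age": 30, "cars": 1},
    {"age": 65, "cars": 0},
]

FUZZY = fuzzy_support(PEOPLE, {"age": old, "cars": several})
CRISP = crisp_support(PEOPLE)
```

    The 72-year-old falls just outside the sharp interval and contributes nothing to the crisp support, but counts fully towards the fuzzy support, while the 55-year-old counts only partially.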

    Subgroup Discovery: Real-World Applications

    Subgroup discovery is a data mining technique which extracts interesting rules with respect to a target variable. An important characteristic of this task is the combination of predictive and descriptive induction. This paper gives an overview of subgroup discovery. In addition, different real-world applications solved through evolutionary algorithms are presented, showing the suitability and potential of this type of algorithm for the development of subgroup discovery algorithms.

    Data mining in soft computing framework: a survey

    The present article provides a survey of the available literature on data mining using soft computing. A categorization has been provided based on the different soft computing tools and their hybridizations used, the data mining function implemented, and the preference criterion selected by the model. The utility of the different soft computing methodologies is highlighted. Generally, fuzzy sets are suitable for handling the issues related to understandability of patterns, incomplete/noisy data, mixed media information and human interaction, and can provide approximate solutions faster. Neural networks are nonparametric, robust, and exhibit good learning and generalization capabilities in data-rich environments. Genetic algorithms provide efficient search algorithms to select a model, from mixed media data, based on some preference criterion/objective function. Rough sets are suitable for handling different types of uncertainty in data. Some challenges to data mining and the application of soft computing methodologies are indicated. An extensive bibliography is also included.

    Unexpected rules using a conceptual distance based on fuzzy ontology

    One of the major drawbacks of data mining methods is that they generate a notably large number of rules that are often obvious or useless or, occasionally, out of the user’s interest. To address such drawbacks, we propose in this paper an approach that detects a set of unexpected rules in a discovered association rule set. Generally speaking, the proposed approach investigates the discovered association rules using the user’s domain knowledge, which is represented by a fuzzy domain ontology. Next, we rank the discovered rules according to the conceptual distances of the rules