10 research outputs found

    Ассоциативные правила в интеллектуальном анализе данных (Association rules in data mining)

    The problem of building models based on association rules is considered. The process of mining association rules is analyzed. Various types of association rules (negative, quantitative, generalized, temporal, and fuzzy) are investigated for use in solving data mining problems.

    Chaotic Rough Particle Swarm Optimization Algorithms

    Irrelevant feature and rule removal for structural associative classification

    In classification tasks, irrelevant features can significantly degrade the performance of classification algorithms: they add processing time, produce more complex models, and make overfitting (and hence poor generalization) more likely. Practical applications of association rule mining often suffer from an overwhelming number of generated rules, many of which are neither interesting nor useful for the application in question. Removing rules composed of irrelevant features can significantly improve overall performance. In this paper, we explore and compare the use of a feature selection measure to filter out unnecessary and irrelevant features/attributes prior to association rule generation. The experiments are performed on a number of real-world datasets that represent diverse characteristics of data items. Empirical results confirm that applying feature subset selection prior to association rule generation eliminates a large number of rules with irrelevant features. More importantly, the results reveal that removing rules that contain irrelevant features improves the accuracy rate and the capability to retain the rule coverage rate of structural associative classification.
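The filtering step described in this abstract can be illustrated with a minimal sketch. The abstract does not name the feature selection measure, so information gain is used here as a stand-in, and the rule generator is limited to single-attribute class rules; the function names (`information_gain`, `select_features`, `class_rules`) and the thresholds are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of discrete labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(rows, feature, target):
    """Reduction in target entropy after splitting on `feature`.

    Assumption: information gain stands in for the paper's (unnamed) measure.
    """
    base = entropy([r[target] for r in rows])
    groups = {}
    for r in rows:
        groups.setdefault(r[feature], []).append(r[target])
    remainder = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return base - remainder


def select_features(rows, features, target, min_gain):
    """Keep only the features whose gain reaches the (hypothetical) threshold."""
    return [f for f in features if information_gain(rows, f, target) >= min_gain]


def class_rules(rows, features, target, min_support, min_confidence):
    """Generate rules of the form (feature = value) -> class.

    Returns tuples (antecedent, class, support, confidence).
    """
    n = len(rows)
    pair_counts = Counter((f, r[f], r[target]) for r in rows for f in features)
    item_counts = Counter((f, r[f]) for r in rows for f in features)
    rules = []
    for (f, v, c), count in pair_counts.items():
        support = count / n
        confidence = count / item_counts[(f, v)]
        if support >= min_support and confidence >= min_confidence:
            rules.append(((f, v), c, support, confidence))
    return rules
```

Calling `class_rules` once with all features and once with the output of `select_features` shows how many rules the filter removes before mining; a real associative classifier would also mine multi-attribute antecedents.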

    AMIC:An Adaptive Information Theoretic Method to Identify Multi-Scale Temporal Correlations in Big Time Series Data

    Quality and interestingness of association rules derived from data mining of relational and semi-structured data

    Deriving useful and interesting rules from a data mining system is an essential task. Common problems include the discovery of random and coincidental patterns, patterns of no significant value, and the generation of a large volume of rules from a database. Work on sustaining the interestingness of rules generated by data mining algorithms is actively being examined and developed. Because data mining techniques are data-driven, it is beneficial to affirm the rules using a statistical approach. It is important to establish how existing statistical measures and constraint parameters can be used effectively, and in what sequence. This thesis presents a systematic way to evaluate the association rules discovered by frequent, closed and maximal itemset mining algorithms, and by a frequent subtree mining algorithm, including rules based on induced, embedded and disconnected subtrees. For frequent subtree mining, a new direction is also explored, based on the DSM approach, which preserves all information from a tree-structured database in a flat data format and thereby enables the direct application of a wider range of data mining analyses and techniques to tree-structured data. The implications of this approach were investigated, and it was found that basing rules on disconnected subtrees can be useful for increasing the accuracy and the coverage rate of the rule set. A strategy is developed that combines data mining with statistical measurement techniques such as sampling, redundancy and contradiction checks, and correlation and regression analysis to evaluate the rules. This framework is then applied to real-world datasets that represent diverse characteristics of data/items. Empirical results show that, with a proper combination of data mining and statistical analysis, the proposed framework is capable of eliminating a large number of non-significant, redundant and contradictory rules while preserving relatively valuable high-accuracy rules. Moreover, the results reveal the important characteristics of, and differences between, mining frequent, closed or maximal itemsets and mining frequent subtrees (including rules based on induced, embedded and disconnected subtrees), as well as the impact of the confidence measure on the prediction and classification task.

    An Information-Theoretic Approach to Quantitative Association Rule Mining

    No full text
    Quantitative Association Rule (QAR) mining has been recognized as an influential research problem over the last decade, owing to the popularity of quantitative databases and the usefulness of association rules in real life. Unlike Boolean Association Rules (BARs), which consider only boolean attributes, QARs consist of quantitative attributes, which contain much richer information than boolean attributes. However, the combination of these quantitative attributes and their value intervals always gives rise to an explosively large number of itemsets, thereby severely degrading mining efficiency. In this paper, we propose an information-theoretic approach that avoids generating unrewarding combinations of the attributes and their value intervals during the mining process. We study the mutual information between the attributes in a quantitative database and devise a normalization of the mutual information to make it applicable in the context of QAR mining. To indicate the strong informative relationships among th

    MIC framework: An information-theoretic approach to quantitative association rule mining

    No full text
    We propose a framework, called MIC, which adopts an information-theoretic approach to address the problem of quantitative association rule mining. In our MIC framework, we first discretize the quantitative attributes. Then, we compute the normalized mutual information between the attributes to construct a graph that indicates the strong informative-relationship between the attributes. We utilize the cliques in the graph to prune the unpromising attribute sets and hence the joined intervals between these attributes. Our experimental results show that the MIC framework significantly improves the mining speed. Importantly, we are able to obtain most of the high-confidence rules and the missing rules are shown to be less interesting. © 2006 IEEE
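The pipeline outlined in this abstract (discretize, compute normalized mutual information, build a graph, use its cliques for pruning) can be sketched roughly as follows. This is an illustration under stated assumptions, not the MIC implementation: equal-width binning, normalization by the smaller of the two entropies, an undirected graph, and the threshold parameter are all choices made here for brevity, and every function name is hypothetical.

```python
import math
from collections import Counter
from itertools import combinations


def discretize(values, bins):
    """Equal-width binning (an assumed scheme); returns one bin index per value."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins or 1.0  # avoid division by zero for constant columns
    return [min(int((v - lo) / width), bins - 1) for v in values]


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of hashable labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def normalized_mi(x, y):
    """I(x; y) divided by min(H(x), H(y)); this normalization is an assumption."""
    mi = entropy(x) + entropy(y) - entropy(list(zip(x, y)))
    denom = min(entropy(x), entropy(y))
    return mi / denom if denom > 0 else 0.0


def mi_graph(columns, threshold):
    """Undirected edges between attribute pairs whose normalized MI >= threshold."""
    return {
        (a, b)
        for a, b in combinations(sorted(columns), 2)
        if normalized_mi(columns[a], columns[b]) >= threshold
    }


def maximal_cliques(nodes, edges):
    """Basic Bron-Kerbosch enumeration of maximal cliques (no pivoting)."""
    adj = {n: set() for n in nodes}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    cliques = []

    def expand(r, p, x):
        if not p and not x:
            cliques.append(frozenset(r))
            return
        for v in list(p):
            expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    expand(set(), set(nodes), set())
    return cliques
```

Under these assumptions, only attribute sets that lie inside a clique would be passed on to interval joining and rule generation; attribute sets spanning weakly related attributes are skipped, which is the source of the speed-up the abstract reports.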