
    Utilisation d'outils de Visual Data Mining pour l'exploration d'un ensemble de règles d'association (Using Visual Data Mining Tools to Explore a Set of Association Rules)

    Data mining aims to extract as much knowledge as possible from huge databases. It is carried out either by an automatic process or by visual exploration of the data with interactive tools. Automatic data mining extracts all the patterns that match a set of metrics. The limitation of such algorithms is that the amount of extracted data can be larger than the initial data volume. In this article, we focus on association rule extraction with the Apriori algorithm. After describing a characterization model for a set of association rules, we propose to explore the results of a data mining algorithm with an interactive visual tool. There are two advantages. First, the tool visualizes the results of the algorithms from different points of view (metrics, rule attributes). Second, it allows us to easily select the most relevant rules from a large set of rules
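
    The abstract above names the Apriori algorithm without detailing it. As a rough illustration only, the level-wise search it is generally known for can be sketched as follows. The function name `apriori`, the toy transactions, and the threshold are assumptions made for this sketch and do not come from the paper.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: support} for every itemset whose support is at
    least min_support (hypothetical helper, not the paper's code)."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)
    # Level 1 candidates: every single item seen in the data.
    candidates = {frozenset([item]) for t in transactions for item in t}
    frequent = {}
    k = 1
    while candidates:
        # Keep only the candidates that are frequent at this level.
        level = {}
        for cand in candidates:
            supp = sum(1 for t in transactions if cand <= t) / n
            if supp >= min_support:
                level[cand] = supp
        frequent.update(level)
        # Join frequent k-itemsets into (k+1)-candidates and prune any
        # candidate that has an infrequent k-subset.
        k += 1
        candidates = set()
        for a, b in combinations(list(level), 2):
            union = a | b
            if len(union) == k and all(
                frozenset(sub) in level for sub in combinations(union, k - 1)
            ):
                candidates.add(union)
    return frequent


# Toy data, invented for illustration.
toy_transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]
frequent_itemsets = apriori(toy_transactions, min_support=0.5)
```

    Even on this tiny input the number of frequent itemsets exceeds the number of transactions, which is the volume problem the abstract points out.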

    Enhancing predictive crime mapping model using association rule mining for geographical and demographic structure

    This research project aims to enhance a predictive crime mapping model with a data mining technique in order to predict the likely rate of crime occurrence. A few specific objectives are stated in order to achieve this aim. The project proposes a data mining technique called association rule mining. Essentially, association rule mining searches for rules according to predefined parameters. A rule is considered useful if it satisfies both the minimum confidence and the minimum support. Apriori is a popular algorithm for finding frequent itemsets in data and deriving association rules. The Communities and Crime dataset from the UCI Machine Learning Repository is used to set up the experiment. 60% of the dataset is used for training, to generate association rules with WEKA. The generated association rules predict the rate of crime occurrence. The remaining 40% of the dataset is used to test the generated rules. A simple C++ program, implemented in Microsoft Visual Studio, tests the generated rules to obtain their accuracy. At the end of the project, the generated rules were tested and yielded different accuracies depending on the predefined minimum support
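
    The usefulness criterion in the abstract above (a rule must meet both a minimum support and a minimum confidence) can be made concrete with a small sketch. The records, attribute names, thresholds, and function names below are hypothetical; they are not taken from the Communities and Crime dataset or from the project's WEKA or C++ setup.

```python
def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    itemset = frozenset(itemset)
    return sum(1 for t in transactions if itemset <= t) / len(transactions)


def confidence(antecedent, consequent, transactions):
    """Support of (antecedent and consequent) divided by support of antecedent."""
    supp_a = support(antecedent, transactions)
    if supp_a == 0:
        return 0.0
    both = frozenset(antecedent) | frozenset(consequent)
    return support(both, transactions) / supp_a


def is_useful(antecedent, consequent, transactions, min_support, min_confidence):
    """A rule is kept only if it meets both thresholds."""
    both = frozenset(antecedent) | frozenset(consequent)
    return (
        support(both, transactions) >= min_support
        and confidence(antecedent, consequent, transactions) >= min_confidence
    )


# Invented community records, for illustration only.
records = [
    frozenset({"urban", "high_unemployment", "high_crime"}),
    frozenset({"urban", "high_unemployment", "high_crime"}),
    frozenset({"urban", "low_unemployment", "low_crime"}),
    frozenset({"rural", "low_unemployment", "low_crime"}),
    frozenset({"rural", "high_unemployment", "high_crime"}),
]

rule_1 = is_useful({"high_unemployment"}, {"high_crime"}, records, 0.5, 0.8)
rule_2 = is_useful({"urban"}, {"high_crime"}, records, 0.5, 0.8)
```

    Raising or lowering the hypothetical `min_support` value changes which rules survive, which is consistent with the abstract's remark that accuracy varied with the predefined minimum support.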

    Mining aeronautical data by using visualized driven rules extraction approach

    Data mining aims to find relevant information in a huge volume of data. It can be automatic, by means of algorithms, or manual, for instance with visual exploration tools. An algorithm finds an exhaustive set of patterns matching specific measures. However, depending on the measure thresholds, the volume of extracted information can be greater than the volume of the initial data. The second approach is visual data mining, which helps the specialist focus on specific areas of the data that may contain interesting patterns. However, it is generally limited by the difficulty of handling large amounts of multidimensional data. In this paper, we propose to combine both methods, using algorithms together with manual visual data mining. From a scatter plot visualization, an algorithm generates association rules that depend on the assignments of the visual variables. These assignments thus have a direct effect on the construction of the rules found. We then characterize the visualization with the extracted association rules in order to show how the data are involved in the rules, and which data can be used for predictions. We illustrate our method on two databases. The first describes one month of French air traffic, and the second comes from an FAA database on the causes of delays and cancellations

    Interactive visual exploration of association rules with rule-focusing methodology

    Because data mining algorithms can produce enormous numbers of rules, knowledge post-processing is a difficult stage in an association rule discovery process. In order to find relevant knowledge for decision making, the user (a decision maker specialized in the data studied) needs to rummage through the rules. To assist him/her in this task, we propose the rule-focusing methodology, an interactive methodology for the visual post-processing of association rules. It allows the user to explore large sets of rules freely by focusing his/her attention on limited subsets. This new approach relies on rule interestingness measures, on a visual representation, and on interactive navigation among the rules. We have implemented the rule-focusing methodology in a prototype system called ARVis. It exploits the user's focus to guide the generation of the rules by means of a specific constraint-based rule-mining algorithm

    Visual grouping of association rules by clustering conditional probabilities for categorical data

    We demonstrate the use of a visual data-mining tool that helps non-technical domain experts within organizations extract meaningful information and knowledge from in-house databases. The tool is based mainly on the notion of grouping association rules. Association rules are useful for discovering items that are frequently found together. However, in many applications, rules with lower frequencies are often of interest to the user. Grouping association rules is one way to overcome the rare item problem. However, some groups of association rules are too large to be easily understood. In this chapter, we propose a method for clustering categorical data based on the conditional probabilities of association rules for data sets with large numbers of attributes. We argue that the proposed method gives non-technical users a better understanding of the patterns discovered in the data set

    OLEMAR: An Online Environment for Mining Association Rules in Multidimensional Data

    Data warehouses and OLAP (online analytical processing) provide tools to explore and navigate through data cubes in order to extract interesting information under different perspectives and levels of granularity. Nevertheless, OLAP techniques do not allow the identification of relationships, groupings, or exceptions that could hold in a data cube. To that end, we propose to enrich OLAP techniques with data mining facilities to benefit from the capabilities they offer. In this chapter, we propose an online environment for mining association rules in data cubes. Our environment, called OLEMAR (online environment for mining association rules), is designed to extract associations from multidimensional data. It allows the extraction of inter-dimensional association rules from data cubes according to a sum-based aggregate measure, a more general indicator than the aggregate values provided by the traditional COUNT measure. In our approach, OLAP users are able to drive a mining process guided by a meta-rule that meets their analysis objectives. In addition, the environment is based on a formalization that exploits aggregate measures to revisit the definition of the support and the confidence of discovered rules. This formalization also helps evaluate the interestingness of association rules according to two additional quality measures: lift and Loevinger. Furthermore, in order to focus on the discovered associations and validate them, we provide a visual representation based on the principles of graphic semiology. Such a representation consists of a graphic encoding of frequent patterns and association rules in the same multidimensional space as the one associated with the mined data cube. We have developed our approach as a component of a general online analysis platform called Miningcubes, following an Apriori-like algorithm that helps extract inter-dimensional association rules directly from materialized multidimensional structures of data. In order to illustrate the effectiveness and the efficiency of our proposal, we analyze a real-life case study about breast cancer data and conduct performance experimentation of the mining process
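
    The two quality measures named in the abstract above, lift and Loevinger, have widely used probability-based definitions, sketched below. This sketch assumes those common definitions and takes plain probabilities as inputs; it does not reproduce the sum-based formalization the chapter describes, and the function names and example values are illustrative only.

```python
def lift(p_a, p_b, p_ab):
    """Lift of A -> B: P(A and B) / (P(A) * P(B)).
    A value of 1 means A and B are independent."""
    if p_a <= 0 or p_b <= 0:
        raise ValueError("P(A) and P(B) must be positive")
    return p_ab / (p_a * p_b)


def loevinger(p_a, p_b, p_ab):
    """Loevinger index of A -> B: (confidence - P(B)) / (1 - P(B)).
    It is 0 under independence and 1 when the rule has no counterexample."""
    if p_a <= 0 or p_b >= 1:
        raise ValueError("requires P(A) > 0 and P(B) < 1")
    conf = p_ab / p_a
    return (conf - p_b) / (1.0 - p_b)


# Illustrative probabilities (invented): P(A) = 0.5, P(B) = 0.5, P(A and B) = 0.4.
example_lift = lift(0.5, 0.5, 0.4)
example_loevinger = loevinger(0.5, 0.5, 0.4)
```

    Both measures compare the rule's confidence with the baseline frequency of the consequent, so a rule with high confidence but a very common consequent is not automatically ranked as interesting.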

    Rule mining in maintenance: analysing large knowledge bases

    Association rule mining is a very powerful tool for extracting knowledge from records contained in industrial databases. A difficulty is that the mining process may produce a huge set of rules that is hard to analyse. This problem is often addressed by an a priori filtering of the candidate rules, which prevents the user from accessing all the potentially interesting knowledge. Another popular solution is visual mining, where visualization techniques allow the user to browse through the rules. In this article we suggest a different approach: generating a large number of rules as a first step, then drilling down into the resulting rule base by alternating semantic analysis (based on a priori knowledge) and objective analysis (based on numerical characteristics of the rules). Real industrial examples from the maintenance domain show that UML Class Diagrams may provide efficient support for subjective analysis, while the practical management of the rules (display, sorting and filtering) is handled by a conventional spreadsheet

    Visualization of Frequent Itemsets with Nested Circular Layout and Bundling Algorithm

    Frequent itemset mining is one of the major data mining issues. Once generated by algorithms, the itemsets can be processed automatically, for instance to extract association rules. They can also be explored with visual tools in order to analyze the emerging patterns. A graphical representation of itemsets is a convenient way to obtain an overview of the global interaction structure. However, when the complexity of the database increases, the network may become unreadable. In this paper, we propose to display itemsets on concentric circles, each one being organized to lower the intricacy of the graph through an optimization process. Thanks to a graph bundling algorithm, we finally obtain a compact representation of a large set of itemsets that is easier to exploit. Color accumulation and interaction operators facilitate the exploration of the new bundled graph and illustrate how strongly an itemset is supported by the data

    Visualisation of Association Rules Based on a Molecular Representation

    Visual representations of association rules are increasingly used to extract interesting knowledge from the large numbers of rules produced by data mining algorithms. These representations can help users find and validate interesting knowledge. The techniques proposed so far for visualising rules represent an association rule as a whole, without paying attention to the relations among the items that make up the antecedent and the consequent, or to the contribution of each item to the rule. In this paper, we propose a new visual representation for association rules that shows the items which make up the antecedent and the consequent, the contribution of each one to the rule, and the correlations between each pair of items in the antecedent and each pair of items in the consequent

    Pattern Mining and Sense-Making Support for Enhancing the User Experience

    While data mining techniques such as frequent itemset and sequence mining are well established as powerful pattern discovery tools in domains from science and medicine to business, a drawback is the lack of support for interactive exploration of the large numbers of patterns generated with diverse parameter settings and of the relationships among the mined patterns. To enhance the user experience, real-time query turnaround times and improved support for interactive mining are desired. There is also an increasing interest in applying data mining solutions to mobile data. Patterns mined over mobile data may enable context-aware applications ranging from automating frequently repeated tasks to providing personalized recommendations. Overall, this dissertation addresses three problems that limit the utility of data mining, namely (a) the lack of interactive exploration tools for mined patterns, (b) insufficient support for mining localized patterns, and (c) high computational requirements that prohibit mining patterns on smaller compute units such as a smartphone. This dissertation develops interactive frameworks for the guided exploration of mined patterns and their relationships. Contributions include the PARAS preprocessing and indexing framework, which enables analysts to gain key insights into rule relationships in a parameter space view, thanks to a compact storage of rules that enables query-time reconstruction of complete rulesets. Contributions also include the visual rule exploration framework FIRE, which presents an interactive dual view of the parameter space and the rule space that together enable enhanced sense-making of rule relationships. This dissertation also supports the online mining of localized association rules computed on data subsets by selectively deploying alternative execution strategies that leverage a multidimensional itemset-based data partitioning index. Finally, we designed OLAPH, an on-device context-aware service that learns phone usage patterns over mobile context data such as app usage, location, and call and SMS logs to provide device intelligence. Concepts introduced for modeling mobile data as sequences include compressing context logs to intervaled context events, adding generalized time features, and identifying meaningful sequences via filter expressions