
    Visualizing association rules in hierarchical groups

    Association rule mining is one of the most popular data mining methods. However, mining association rules often results in a very large number of found rules, leaving the analyst with the task of going through all the rules to discover the interesting ones. Sifting manually through large sets of rules is time-consuming and strenuous. Although visualization has a long history of making large amounts of data more accessible using techniques like selecting and zooming, most association rule visualization techniques still fall short when it comes to large numbers of rules. In this paper we introduce a new interactive visualization method, the grouped matrix representation, which allows analysts to explore and interpret highly complex scenarios intuitively. We demonstrate how the method can be used to analyze large sets of association rules using the R software for statistical computing, and provide examples from the implementation in the R package arulesViz. (authors' abstract)
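
    To illustrate why even small datasets yield many rules, here is a minimal Python sketch (not the arulesViz implementation, which is written in R). The transactions, thresholds, and function names are hypothetical; the rule enumeration is brute force and only suitable for toy data.

```python
from itertools import combinations

# Toy transactions (hypothetical data, not taken from the paper).
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk", "butter"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    return sum(itemset <= t for t in transactions) / len(transactions)


def mine_rules(transactions, min_support=0.25, min_confidence=0.5):
    """Enumerate rules lhs -> rhs by brute force (toy data only)."""
    items = sorted(set().union(*transactions))
    rules = []
    for size in range(2, len(items) + 1):
        for itemset in map(frozenset, combinations(items, size)):
            s = support(itemset, transactions)
            if s < min_support:
                continue
            # Split the frequent itemset into every non-empty lhs / rhs pair.
            for k in range(1, size):
                for lhs in map(frozenset, combinations(sorted(itemset), k)):
                    rhs = itemset - lhs
                    confidence = s / support(lhs, transactions)
                    lift = confidence / support(rhs, transactions)
                    if confidence >= min_confidence:
                        rules.append((lhs, rhs, s, confidence, lift))
    return rules


rules = mine_rules(transactions)
for lhs, rhs, s, confidence, lift in rules:
    print(sorted(lhs), "->", sorted(rhs),
          f"support={s:.2f} confidence={confidence:.2f} lift={lift:.2f}")
```

    With only three distinct items and four transactions, this toy example already produces nine rules, which hints at how quickly rule sets grow on realistic data.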

    Specific Usage of Visual Data Analysis Techniques

    Visualization techniques are very important tools for data mining processes. They are widely applied in many areas, especially in supporting decision-making processes. We use visualization tools for rule generation, classification, and clustering. The paper presents applications of data visualization techniques and tools for the generation of association rules, classification, and clustering.

    Combining the Attribute Oriented Induction and Graph Visualization to Enhancement Association Rules Interpretation

    Data mining comprises many important methods, one of which is association rule mining. Association rule mining produces a huge number of rules, so the analyst spends considerable time searching through them for the interesting ones. One solution to this problem is to combine an association rule visualization method with a generalization method. The visualization method is graph-based; the generalization method is the Attribute Oriented Induction (AOI) algorithm. After the combination, AOI is called Modified AOI because it removes and changes steps of the traditional AOI. The graph technique is likewise called the grouped graph method because it displays the aggregated rules that result from AOI. The results of this paper are compression ratios that give clarity of visualization. These results make it possible to test and drill down into the rules, or to understand and roll up.

    AssocExplorer: An association rule visualization system for exploratory data analysis

    DOI: 10.1145/2339530.2339774. Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 1536-153

    Spatio-Temporal Associative Mining for Earthquake Data Distribution in Indonesia

    Indonesia is the country with the highest seismic activity in the world. It has a very high earthquake frequency because it is traversed by the meeting of three plates and is located in the Ring of Fire. The shaking from an earthquake is very strong and propagates in all directions; it is capable of destroying even the strongest civilian buildings, so earthquakes claim many human lives. In addition, earthquakes in Indonesia show seismic relations between provinces. In this paper, we present a new earthquake spatio-temporal mapping system based on the association confidence value obtained from an associative mining process on earthquake data distribution in Indonesia. The proposed system has three main functions: (1) data acquisition, which takes data from four data providers, then preprocesses and combines them into one dataset; (2) an associative mining process to obtain the association rules of earthquakes between provinces in Indonesia; and (3) an earthquake association spatio-temporal model built from the highest confidence value, together with visualization. We use data from several earthquake data providers from 1900 to 2018. To build our proposed spatio-temporal earthquake association mapping system, we divided the data into discrete 5-year partitions. We then mined the rules and took the highest confidence value from each period. This confidence value is used for the modeling and visualization of our spatio-temporal mapping system. As a result of this study, we generated an earthquake association risk map of 13 provinces that had earthquake connectivity with each other. The provinces are Aceh, Sumatera Utara, Bengkulu, East Java, Bali, NTB, NTT, Maluku, North Maluku, Gorontalo, North Sulawesi, Papua, and West Papua.
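
    The partition-then-mine step described in this abstract can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' system: the event records, the notion of a "basket" of co-occurring provinces, the tie-breaking rule, and all function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical records: (year, provinces with earthquakes in the same time window).
events = [
    (1901, {"Aceh", "Sumatera Utara"}),
    (1903, {"Aceh", "Sumatera Utara"}),
    (1904, {"Aceh", "Bengkulu"}),
    (1906, {"Bali", "NTB"}),
    (1908, {"Bali"}),
]


def period_start(year, first_year=1900, width=5):
    """First year of the discrete 5-year partition that contains `year`."""
    return first_year + ((year - first_year) // width) * width


def best_rule_per_period(events):
    """For each period, return the rule A -> B with the highest confidence.

    Confidence(A -> B) = count(A and B) / count(A) within the period.
    Ties are broken by the larger co-occurrence count (an assumption).
    """
    by_period = defaultdict(list)
    for year, provinces in events:
        by_period[period_start(year)].append(provinces)

    best = {}
    for start, baskets in sorted(by_period.items()):
        provinces = sorted(set().union(*baskets))
        candidates = []
        for a in provinces:
            n_a = sum(a in b for b in baskets)
            for c in provinces:
                if a == c:
                    continue
                n_ac = sum(a in b and c in b for b in baskets)
                if n_ac:
                    candidates.append((n_ac / n_a, n_ac, a, c))
        if candidates:
            confidence, _, a, c = max(candidates, key=lambda r: (r[0], r[1]))
            best[start] = (a, c, confidence)
    return best


best = best_rule_per_period(events)
for start, (a, c, confidence) in best.items():
    print(f"{start}-{start + 4}: {a} -> {c} (confidence {confidence:.2f})")
```

    The per-period maxima are what a mapping layer would then visualize; how the original system draws them is not specified in the abstract.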

    Multi-threaded Implementation of Association Rule Mining with Visualization of the Pattern Tree

    Motor vehicle fatalities per 100,000 population in the United States were reported to be 10.69% in 2012 by NHTSA (National Highway Traffic Safety Administration). The fatality rate increased by 0.27% in 2012 compared to the rate in 2011. According to the reports, many factors contribute to the drastic increase in the fatality rate, such as driving under the influence, texting while driving, and various weather phenomena. Decision makers need to analyze the factors contributing to the increase in the accident rate in order to take appropriate measures. The current methods used to perform the data analysis have to be reformed and optimized to make policies for controlling the high traffic accident rates. This research work is an extension of the data mining algorithm implementation Most Associated Sequential Pattern (MASP). MASP uses an association rule mining approach to mine interesting traffic accident data using a modified version of the FP-growth algorithm. Owing to the huge amounts of available traffic accident data, the MASP algorithm needs to be further modified to make it more efficient with respect to both space and time. Therefore, we present a parallel implementation of the MASP algorithm. In addition, a pattern tree and the apriori-tid algorithm have been implemented. The application is designed in C# using the .NET Framework and the C# Task Parallel Library.

    Protein secondary structure prediction using BLAST and relaxed threshold rule induction from coverings

    Protein structure prediction has always been an important research area in bioinformatics and biochemistry. Despite the recent breakthrough of combining multiple sequence alignment information and artificial intelligence algorithms to predict protein secondary structure, the Q₃ accuracy of various computational prediction methods rarely has exceeded 75%; this status has changed little since 2003 when Rost stated that the currently best methods reach a level around 77% three-state per-residue accuracy. The application of artificial neural network methods to this problem is revolutionary in the sense that those techniques employ the homologues of proteins for training and prediction. In this dissertation, a different approach, RT-RICO (Relaxed Threshold Rule Induction from Coverings), is presented that instead uses association rule mining. This approach still makes use of the fundamental principle that structure is more conserved than sequence. However, rules between each known secondary structure element and its neighboring amino acid residues are established to perform the predictions. This dissertation consists of five research articles that discuss different prediction techniques and detailed rule-generation algorithms. The most recent prediction approach, BLAST-RT-RICO, achieved a Q₃ accuracy score of 89.93% on the standard test dataset RS126 and a Q₃ score of 87.71% on the standard test dataset CB396, an improvement over comparable computational methods. Herein one research article also discusses the results of examining those RT-RICO rules using an existing association rule visualization tool, modified to account for the non-Boolean characterization of protein secondary structure --Abstract, page iv
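
    For readers unfamiliar with the Q₃ score quoted in this abstract, it is the percentage of residues whose predicted state among three classes (commonly written H for helix, E for strand, C for coil) matches the observed state. A minimal Python sketch follows; the function name and example strings are hypothetical, and this is not the RT-RICO code.

```python
def q3_accuracy(predicted, observed):
    """Three-state per-residue accuracy, as a percentage.

    Both arguments are equal-length strings over the states H, E, and C.
    """
    if len(predicted) != len(observed):
        raise ValueError("sequences must have the same length")
    if not observed:
        raise ValueError("sequences must not be empty")
    matches = sum(p == o for p, o in zip(predicted, observed))
    return 100.0 * matches / len(observed)


# Hypothetical six-residue example: five of six states agree.
print(round(q3_accuracy("HHHECC", "HHEECC"), 2))
```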

    Combining Clustering techniques and Formal Concept Analysis to characterize Interestingness Measures

    Formal Concept Analysis (FCA) is a data analysis method that makes it possible to discover hidden knowledge in data. One kind of hidden knowledge extracted from data is association rules. Different quality measures have been reported in the literature for extracting only relevant association rules. Given a dataset, choosing a good quality measure remains a challenging task for a user. Given an evaluation matrix of quality measures according to semantic properties, this paper describes how FCA can highlight quality measures with similar behavior in order to help users make their choice. The aim of this article is to discover clusters of Interestingness Measures (IM) that can validate those found by the hierarchical and partitioning clustering methods AHC and k-means. Then, based on the theoretical study of sixty-one interestingness measures according to nineteen properties, proposed in a recent study, FCA describes several groups of measures. Comment: 13 pages, 2 figures
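
    As a rough illustration of how FCA groups measures that share properties, the sketch below enumerates the formal concepts of a tiny binary context. The context (measures IM1 to IM4, properties P1 and P2) is invented for illustration and does not come from the paper's 61 x 19 evaluation matrix; the exhaustive enumeration is only practical for very small contexts.

```python
from itertools import combinations

# Hypothetical context: measure -> set of properties it satisfies.
context = {
    "IM1": {"P1", "P2"},
    "IM2": {"P2"},
    "IM3": {"P1"},
    "IM4": {"P1", "P2"},
}


def extent(props, context):
    """All measures that satisfy every property in `props`."""
    return frozenset(m for m, ps in context.items() if props <= ps)


def intent(measures, context, all_props):
    """All properties shared by every measure in `measures`."""
    shared = set(all_props)
    for m in measures:
        shared &= context[m]
    return frozenset(shared)


def formal_concepts(context):
    """Enumerate (extent, intent) pairs by closing every property subset."""
    all_props = sorted(set().union(*context.values()))
    concepts = set()
    for k in range(len(all_props) + 1):
        for props in combinations(all_props, k):
            ext = extent(set(props), context)
            concepts.add((ext, intent(ext, context, all_props)))
    return concepts


concepts = formal_concepts(context)
for ext, itt in sorted(concepts, key=lambda c: (len(c[1]), sorted(c[1]))):
    print(sorted(ext), "share", sorted(itt))
```

    Each concept pairs a group of measures with exactly the properties they have in common, so measures that land in the same extents behave alike with respect to the chosen properties.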