4 research outputs found

    ARM-AMO: An Efficient Association Rule Mining Algorithm Based on Animal Migration Optimization

    The file attached to this record is the author's final peer reviewed version. The Publisher's final version can be found by following the DOI link.
    Association rule mining (ARM) aims to find association rules that satisfy predefined minimum support and confidence thresholds in a given database. However, in many cases ARM generates an extremely large number of association rules, which are impossible for end users to comprehend or validate, thereby limiting the usefulness of data mining results. In this paper, we propose a new mining algorithm based on Animal Migration Optimization (AMO), called ARM-AMO, to reduce the number of association rules. It is based on the idea that rules that lack high support and are unnecessary are removed. Firstly, the Apriori algorithm is applied to generate frequent itemsets and association rules. Then, AMO is used to reduce the number of association rules with a new fitness function that incorporates frequent rules. The experiments show that, in comparison with other relevant techniques, ARM-AMO greatly reduces the computational time for frequent itemset generation, the memory for association rule generation, and the number of rules generated.
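
The first stage described in this abstract (Apriori-style frequent itemset generation followed by rule extraction under minimum support and confidence) can be illustrated with a minimal sketch. It is not the authors' ARM-AMO implementation: the AMO pruning stage and its fitness function are not specified in the abstract and are left out, and the function names and the toy `TRANSACTIONS` data are assumptions made for illustration only.

```python
from itertools import combinations

# Hypothetical toy database: each transaction is a set of items.
TRANSACTIONS = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

def support(itemset, transactions):
    """Fraction of transactions that contain every item of `itemset`."""
    return sum(1 for t in transactions if itemset <= t) / len(transactions)

def frequent_itemsets(transactions, min_support):
    """Level-wise (Apriori-style) search; returns {itemset: support}."""
    items = sorted({i for t in transactions for i in t})
    level = [frozenset([i]) for i in items]
    frequent = {}
    while level:
        # Keep only the candidates that reach the minimum support.
        kept = {c: support(c, transactions) for c in level}
        kept = {c: s for c, s in kept.items() if s >= min_support}
        frequent.update(kept)
        size = len(next(iter(kept))) + 1 if kept else 0
        # Join step: merge kept itemsets into candidates one item larger.
        candidates = {a | b for a in kept for b in kept if len(a | b) == size}
        # Prune step: every (size-1)-subset of a candidate must be frequent.
        level = [
            c for c in candidates
            if all(frozenset(s) in kept for s in combinations(c, size - 1))
        ]
    return frequent

def association_rules(frequent, min_confidence):
    """Return (antecedent, consequent, support, confidence) tuples."""
    rules = []
    for itemset, sup in frequent.items():
        for r in range(1, len(itemset)):
            for lhs in combinations(sorted(itemset), r):
                lhs = frozenset(lhs)
                # Subsets of a frequent itemset are frequent, so the lookup succeeds.
                confidence = sup / frequent[lhs]
                if confidence >= min_confidence:
                    rules.append((lhs, itemset - lhs, sup, confidence))
    return rules

if __name__ == "__main__":
    found = frequent_itemsets(TRANSACTIONS, min_support=0.5)
    for lhs, rhs, sup, conf in association_rules(found, min_confidence=0.8):
        print(sorted(lhs), "->", sorted(rhs), f"support={sup:.2f} confidence={conf:.2f}")
```

Even on this small example the rule list grows quickly as the thresholds are lowered, which is the problem a second, rule-reducing stage such as the one the abstract describes is meant to address.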

    Efficient Algorithms for Mining Colossal Patterns in High Dimensional Databases

    With the rapid development of information technology and its application in many areas of life and the socio-economy, the amount of information stored in database systems has grown for many years, and this data accumulates at an explosive rate. This huge amount of data is a valuable resource, because information is a key element in many areas. Data mining has helped users gain valuable insights from huge databases and data warehouses and has been widely applied in many fields. In data mining, an association rule indicates an association or correlation of the form "condition → consequent" between data elements. Detecting association rules means detecting those relationships within a given data set. Association rules were first introduced in 1993 by Agrawal et al. [1] and have become one of the major topics of data mining research, especially in recent years. Association detection has been successfully applied in many socio-economic fields such as trade, health, biology, finance and banking. In association rule mining, frequent pattern mining is a key task. Frequent pattern mining finds the patterns that occur frequently in databases. In the last two decades, researchers have proposed many techniques and algorithms for extracting frequent patterns, in which the downward closure property plays a fundamental role. Two challenges in pattern mining are the computational cost and the potentially huge number of extracted patterns. In this thesis, we present an overview of the work done on frequent pattern mining, especially colossal pattern mining, and develop methods for mining frequent colossal patterns in high dimensional databases that can tackle emerging data processing workloads while coping with ever larger scales. Firstly, we develop the CP (colossal pattern)-tree for efficiently storing colossal patterns. Next, we propose the CP-Miner algorithm to mine colossal patterns. CP-Miner is based on the CP-tree, early pruning of transactions and dynamic bit vectors. PCP-Miner, an improved version of CP-Miner, is also developed to reduce runtime and memory usage. In PCP-Miner, we develop theorems to prune non-colossal patterns during the mining process. We also develop methods for mining colossal patterns with constraints. In our proposal, two kinds of constraints are developed: pattern constraints and length constraints.
    460 - Katedra informatiky (Department of Computer Science), vyhově
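
Two ideas this abstract relies on, bit-vector support counting and the downward closure property, can be shown in a short sketch. It uses plain Python integers as fixed bit vectors, not the thesis's dynamic bit vectors or its CP-tree, whose details the abstract does not give. The function names and the toy data are assumptions for illustration.

```python
def build_bit_vectors(transactions):
    """Map each item to an integer whose bit `tid` is set when
    transaction number `tid` contains that item."""
    vectors = {}
    for tid, transaction in enumerate(transactions):
        for item in transaction:
            vectors[item] = vectors.get(item, 0) | (1 << tid)
    return vectors

def support_count(itemset, vectors):
    """Number of transactions containing all items of a non-empty itemset,
    computed by AND-ing the item bit vectors and counting the set bits."""
    if not itemset:
        raise ValueError("itemset must not be empty")
    acc = None
    for item in itemset:
        vector = vectors.get(item, 0)  # unseen items occur in no transaction
        acc = vector if acc is None else acc & vector
    return bin(acc).count("1")

if __name__ == "__main__":
    # Hypothetical toy database with four transactions.
    transactions = [
        {"bread", "milk"},
        {"bread", "butter"},
        {"bread", "milk", "butter"},
        {"milk"},
    ]
    vectors = build_bit_vectors(transactions)
    pair = support_count({"bread", "milk"}, vectors)
    triple = support_count({"bread", "milk", "butter"}, vectors)
    # Downward closure: a superset never has higher support than its subset,
    # so an infrequent pattern can be pruned together with all its extensions.
    print(pair, triple, triple <= pair)
```

Because each extension of a pattern only adds one more AND, support counts for long patterns stay cheap to compute, which matters when the database has many items and few transactions.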

    Efficient algorithms for mining colossal patterns in high dimensional databases

    No full text
    Mining association rules plays an important role in decision support systems. To mine strong association rules, it is necessary to mine frequent patterns. Many algorithms have been developed to mine frequent patterns efficiently, such as Apriori, Eclat, FP-Growth, PrePost, and FIN. However, these are only efficient when the database has a small number of items. When a database has a large number of items (from thousands to hundreds of thousands) but a small number of transactions, these algorithms cannot run when the minimum support threshold is also small, because the search space is huge. This gives rise to the problem of mining colossal patterns in high dimensional databases. In 2012, Sohrabi and Barforoush proposed the BVBUC algorithm for mining colossal patterns based on a bottom-up scheme. However, it needs more time to check subsets and supersets, because it generates a lot of candidates and consumes more memory to store them. In this paper we propose new, efficient algorithms for mining colossal patterns. Firstly, the CP (Colossal Pattern)-tree is designed. Next, we develop two theorems to rapidly compute patterns of nodes and prune nodes without loss of information in colossal patterns. Based on the CP-tree and these theorems, an algorithm (named CP-Miner) is proposed to solve the problem of mining colossal patterns. A sorting strategy for efficiently mining colossal patterns is then developed. This strategy helps to reduce the number of significant candidates and the time needed to check subsets and supersets. The PCP-Miner algorithm, which uses this strategy, is then proposed, and we also conduct experiments to show the efficiency of these algorithms.
    Web of Science 122897