Search CORE

706 research outputs found

Mining Frequent Itemsets Using Genetic Algorithm

Author: Biswas Sushanta
Ghosh Soumadip
Sarkar Debasree
Sarkar Partha Pratim
Publication venue: 'Academy and Industry Research Collaboration Center (AIRCC)'
Publication date: 01/11/2010
Field of study

In general frequent itemsets are generated from large data sets by applying association rule mining algorithms like Apriori, Partition, Pincer-Search, Incremental, Border algorithm etc., which take too much computer time to compute all the frequent itemsets. By using Genetic Algorithm (GA) we can improve the scenario. The major advantage of using GA in the discovery of frequent itemsets is that they perform global search and its time complexity is less compared to other algorithms as the genetic algorithm is based on the greedy approach. The main aim of this paper is to find all the frequent itemsets from given data sets using genetic algorithm

arXiv.org e-Print Archive

CiteSeerX

Crossref

Discovering Unexpected Patterns in Temporal Data Using Temporal Logic

Author: Berger Gideon
Tuzhilin Alexander
Publication venue: Stern School of Business, New York University
Publication date: 01/01/1998
Field of study

There has been much attention given recently to the task of finding interesting patterns in temporal databases. Since there are so many different approaches to the problem of discovering temporal patterns, we first present a characterization of different discovery tasks and then focus on one task of discovering interesting patterns of events in temporal sequences. Given an (infinite) temporal database or a sequence of events one can, in general, discover an infinite number of temporal patterns in this data. Therefore, it is important to specify some measure of interestingness for discovered patterns and then select only the patterns interesting according to this measure. We present a probabilistic measure of interestingness based on unexpectedness, whereby a pattern P is deemed interesting if the ratio of the actual number of occurrences of P exceeds the expected number of occurrences of P by some user defined threshold. We then make use of a subset of the propositional, linear temporal logic and present an efficient algorithm that discovers unexpected patterns in temporal data. Finally, we apply this algorithm to synthetic data, UNIX operating system calls, and Web logfiles and present the results of these experiments.Information Systems Working Papers Serie

New York University Faculty Digital Archive

Detection of Interesting Traffic Accident Patterns by Association Rule Mining

Author: Donepudi Harisha
Publication venue: LSU Digital Commons
Publication date: 01/01/2013
Field of study

In recent years, the accident rate related to traffic is high. Analyzing the crash data and extracting useful information from it can help in taking respective measures to decrease this rate or prevent the crash from happening. Related research has been done in the past which involved proposing various measures and algorithms to obtain interesting crash patterns from the crash records. The main problem is that large numbers of patterns were produced and vast number of these patterns would be obvious or not interesting. A deeper analysis of the data is required in order to get the interesting patterns. In order to overcome this situation, we have proposed a new approach to detect the most associated sequential patterns in the crash data. We also make use of the technique, “Association Rule Mining” to mine interesting traffic accident patterns from the crash records. The main goal of this research is to detect the most associated sequential patterns (MASP) and mine patterns within the data sets generated by MASP using a modified FP-growth approach in regular association rule mining. We have designed and implemented data structures for efficient implementation of algorithms. The results extracted can be further queried for pattern analysis to get a deeper understanding. Efficient memory management is one of the main objectives during the implementation of the algorithms. Linked list based tree structures have been used for searching the patterns. The results obtained seemed to be very promising and the detected MASPs contained most of the attributes which gave a deeper insight into the crash data and the patterns were found to be very interesting. A prototype application is developed in C# .NET

Louisiana State University

A novel MapReduce Lift association rule mining algorithm (MRLAR) for Big Data

Author: Fouad Mohamed Mostafa
Owais Suhail Sami Jebour
Oweis Nour Easa
Oweis Sami R.
Snášel Václav
Publication venue: 'The Science and Information Organization'
Publication date: 01/01/2016
Field of study

Big Data mining is an analytic process used to dis-cover the hidden knowledge and patterns from a massive, com-plex, and multi-dimensional dataset. Single-processor's memory and CPU resources are very limited, which makes the algorithm performance ineffective. Recently, there has been renewed inter-est in using association rule mining (ARM) in Big Data to uncov-er relationships between what seems to be unrelated. However, the traditional discovery ARM techniques are unable to handle this huge amount of data. Therefore, there is a vital need to scal-able and parallel strategies for ARM based on Big Data ap-proaches. This paper develops a novel MapReduce framework for an association rule algorithm based on Lift interestingness measurement (MRLAR) which can handle massive datasets with a large number of nodes. The experimental result shows the effi-ciency of the proposed algorithm to measure the correlations between itemsets through integrating the uses of MapReduce and LIM instead of depending on confidence.Web of Science7315715

Crossref

DSpace at VSB Technical University of Ostrava