706 research outputs found
Mining Frequent Itemsets Using Genetic Algorithm
In general, frequent itemsets are generated from large data sets by applying
association rule mining algorithms such as Apriori, Partition, Pincer-Search,
Incremental, and Border, which take too much computation time to
compute all the frequent itemsets. A Genetic Algorithm (GA) can
improve on this. The major advantage of using a GA to discover
frequent itemsets is that it performs a global search, and its time complexity is
lower than that of other algorithms because the genetic algorithm is based on the
greedy approach. The main aim of this paper is to find all the frequent
itemsets in given data sets using a genetic algorithm.
Discovering Unexpected Patterns in Temporal Data Using Temporal Logic
There has been much attention given recently to the task
of finding interesting patterns in temporal databases. Since there are so
many different approaches to the problem of discovering temporal patterns,
we first present a characterization of different discovery tasks and
then focus on one task of discovering interesting patterns of events in
temporal sequences. Given an (infinite) temporal database or a sequence
of events one can, in general, discover an infinite number of temporal
patterns in this data. Therefore, it is important to specify some measure
of interestingness for discovered patterns and then select only the patterns
interesting according to this measure. We present a probabilistic
measure of interestingness based on unexpectedness, whereby a pattern P
is deemed interesting if the ratio of the actual number of occurrences of
P to the expected number of occurrences of P exceeds some user-defined
threshold. We then make use of a subset of the propositional, linear temporal
logic and present an efficient algorithm that discovers unexpected
patterns in temporal data. Finally, we apply this algorithm to synthetic
data, UNIX operating system calls, and Web logfiles and present the
results of these experiments. Information Systems Working Papers Serie
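The ratio-based interestingness test described in this abstract can be sketched as follows. This is a minimal illustration, not the paper's algorithm: the function names are hypothetical, and it assumes the actual and expected occurrence counts for a pattern have already been computed by some other means.

```python
def unexpectedness_ratio(actual_count, expected_count):
    """Ratio of actual to expected occurrences of a pattern (hypothetical helper)."""
    if expected_count <= 0:
        raise ValueError("expected_count must be positive")
    return actual_count / expected_count


def is_interesting(actual_count, expected_count, threshold):
    """A pattern is deemed interesting when the ratio exceeds the user-defined threshold."""
    return unexpectedness_ratio(actual_count, expected_count) > threshold
```

For example, a pattern seen 30 times when 10 occurrences were expected has a ratio of 3.0, so it would pass a threshold of 2.0.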
Detection of Interesting Traffic Accident Patterns by Association Rule Mining
In recent years, the traffic accident rate has been high. Analyzing crash data and extracting useful information from it can help in taking measures to decrease this rate or prevent crashes from happening. Past research has proposed various measures and algorithms for obtaining interesting crash patterns from crash records. The main problem is that large numbers of patterns were produced, and a vast number of these patterns were obvious or uninteresting. A deeper analysis of the data is required to get the interesting patterns. To overcome this problem, we propose a new approach to detect the most associated sequential patterns in the crash data. We also use association rule mining to mine interesting traffic accident patterns from the crash records. The main goal of this research is to detect the most associated sequential patterns (MASP) and mine patterns within the data sets generated by MASP using a modified FP-growth approach in regular association rule mining. We have designed and implemented data structures for efficient implementation of the algorithms. The extracted results can be queried further for pattern analysis to gain a deeper understanding. Efficient memory management is one of the main objectives in implementing the algorithms. Linked-list-based tree structures are used for searching the patterns. The results obtained appear very promising: the detected MASPs contained most of the attributes, which gave a deeper insight into the crash data, and the patterns were found to be very interesting. A prototype application was developed in C# .NET.
A novel MapReduce Lift association rule mining algorithm (MRLAR) for Big Data
Big Data mining is an analytic process used to discover hidden knowledge and patterns in a massive, complex, and multi-dimensional dataset. A single processor's memory and CPU resources are very limited, which makes algorithm performance ineffective. Recently, there has been renewed interest in using association rule mining (ARM) in Big Data to uncover relationships between seemingly unrelated items. However, traditional ARM discovery techniques are unable to handle this huge amount of data. Therefore, there is a vital need for scalable and parallel strategies for ARM based on Big Data approaches. This paper develops a novel MapReduce framework for an association rule algorithm based on the Lift interestingness measurement (MRLAR), which can handle massive datasets with a large number of nodes. The experimental results show the efficiency of the proposed algorithm in measuring the correlations between itemsets by integrating MapReduce and LIM instead of depending on confidence.
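For readers unfamiliar with lift, the measure can be illustrated on a small in-memory dataset. The sketch below uses the standard definition, lift(A -> C) = support(A and C) / (support(A) * support(C)); it is a single-machine illustration with hypothetical function names and made-up transactions, and does not reproduce the paper's MapReduce implementation.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in itemset."""
    if not transactions:
        raise ValueError("transactions must be non-empty")
    itemset = frozenset(itemset)
    hits = sum(1 for t in transactions if itemset <= set(t))
    return hits / len(transactions)


def lift(transactions, antecedent, consequent):
    """Standard lift of the rule antecedent -> consequent."""
    s_a = support(transactions, antecedent)
    s_c = support(transactions, consequent)
    if s_a == 0 or s_c == 0:
        raise ValueError("antecedent and consequent must each occur at least once")
    s_both = support(transactions, set(antecedent) | set(consequent))
    return s_both / (s_a * s_c)


# Illustrative transactions (assumed data, not from the paper).
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"milk", "butter"},
    {"bread", "milk", "butter"},
]
bread_milk_lift = lift(transactions, {"bread"}, {"milk"})
```

A lift above 1 suggests the two itemsets co-occur more often than independence would predict, a lift below 1 suggests they co-occur less often, and a lift near 1 suggests little correlation. Unlike confidence, lift accounts for how common the consequent is on its own.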