Map/Reduce Design and Implementation of Apriori Algorithm for handling voluminous data-sets
Apriori is one of the key algorithms for generating frequent itemsets. Frequent
itemset analysis is a crucial step in analysing structured data and in finding
association relationships between items. It stands as an elementary foundation
for supervised learning, which encompasses classifier and feature extraction
methods, so applying this algorithm is essential to understanding the behaviour
of structured data. Most structured data in the scientific domain is
voluminous, and processing such data requires state-of-the-art computing
machines. Setting up such an infrastructure is expensive, so a distributed
environment such as a clustered setup is employed instead. The Apache Hadoop
distribution is one such cluster framework: it distributes voluminous data
across the nodes of the cluster. This paper focuses on the map/reduce design
and implementation of the Apriori algorithm for structured data analysis.

Comment: 11 pages, 5 figures; Advanced Computing: An International Journal
(ACIJ), Vol.3, No.6, November 201
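The core idea the abstract describes can be sketched as a single map/reduce pass of candidate counting: a mapper emits each candidate k-itemset found in a transaction, and a reducer sums the counts and filters by minimum support. The sketch below is a minimal in-process illustration of that pattern, not the paper's Hadoop implementation; the transaction data and the min_support value are made up for demonstration.

```python
from collections import Counter
from itertools import combinations

# Hypothetical transaction database for illustration (not from the paper).
transactions = [
    {"bread", "milk"},
    {"bread", "diapers", "beer", "eggs"},
    {"milk", "diapers", "beer", "cola"},
    {"bread", "milk", "diapers", "beer"},
    {"bread", "milk", "diapers", "cola"},
]

def mapper(transaction, k):
    """Emit (candidate k-itemset, 1) pairs for one transaction."""
    for itemset in combinations(sorted(transaction), k):
        yield itemset, 1

def reducer(pairs, min_support):
    """Sum counts per itemset and keep those meeting min_support."""
    counts = Counter()
    for itemset, one in pairs:
        counts[itemset] += one
    return {s: c for s, c in counts.items() if c >= min_support}

# One Apriori pass for k = 2 with a support threshold of 3 transactions.
pairs = (p for t in transactions for p in mapper(t, 2))
frequent_2 = reducer(pairs, min_support=3)
```

In a real Hadoop job the mapper and reducer would run on separate nodes with the transaction file split across them, and the full Apriori algorithm would iterate this pass for k = 1, 2, 3, ..., generating level-k candidates only from the frequent (k-1)-itemsets.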