Implementing Graph Pattern Mining for Big Data in the Cloud
With the increasing popularity of social networking sites, the data associated with them is growing explosively, so mining big data has become an important problem in graph pattern mining research. Graph mining helps to discover patterns in networks or databases. Existing graph mining techniques find frequent patterns in graph databases that contain relatively small graphs. With the arrival of the big data era, however, these traditional approaches can no longer meet the needs of large-scale data analysis. In this context, this paper proposes an adaptation of the graph mining approach to big graph data, especially in the field of social networks. The proposed approach is based on the Hadoop platform and improves efficiency by processing big data in a distributed fashion. It can also be adapted to a cloud environment, which offers load balancing, scalability, and efficiency. Experiments were conducted on a real Facebook data set. The approach can also be applied to data sets larger than the one used in the experiments.
DOI: 10.17762/ijritcc2321-8169.150514
The Family of MapReduce and Large Scale Data Processing Systems
In the last two decades, the continuous increase of computational power has
produced an overwhelming flow of data which has called for a paradigm shift in
the computing architecture and large scale data processing mechanisms.
MapReduce is a simple and powerful programming model that enables easy
development of scalable parallel applications to process vast amounts of data
on large clusters of commodity machines. It isolates the application from the
details of running a distributed program such as issues on data distribution,
scheduling and fault tolerance. However, the original implementation of the
MapReduce framework had some limitations that have been tackled by many
research efforts in several follow-up works after its introduction. This article
provides a comprehensive survey of a family of approaches and mechanisms for
large scale data processing that have been implemented based on the
original idea of the MapReduce framework and are currently gaining a lot of
momentum in both research and industrial communities. We also cover a set of
introduced systems that have been implemented to provide declarative
programming interfaces on top of the MapReduce framework. In addition, we
review several large scale data processing systems that resemble some of the
ideas of the MapReduce framework for different purposes and application
scenarios. Finally, we discuss some of the future research directions for
implementing the next generation of MapReduce-like solutions.
Comment: arXiv admin note: text overlap with arXiv:1105.4252 by other author
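As a rough illustration of the programming model the abstract above describes, the following single-process sketch mimics the map, shuffle, and reduce steps for a word count. It is not taken from the surveyed article; the function names (`map_phase`, `shuffle`, `reduce_phase`, `run_job`) and the sample input are assumptions made for this example, and a real framework would distribute each step across a cluster and handle scheduling and fault tolerance itself.

```python
from collections import defaultdict
from itertools import chain


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one input record."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Shuffle: group emitted values by key (done by the framework in practice)."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine all values for one key into a single result."""
    return key, sum(values)


def run_job(documents):
    """Hypothetical driver: runs the three steps sequentially in one process."""
    mapped = chain.from_iterable(map_phase(doc) for doc in documents)
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


# Hypothetical sample input, chosen only for illustration.
counts = run_job(["big data on big clusters", "data on commodity machines"])
```

The application code here consists only of `map_phase` and `reduce_phase`; everything else stands in for what the framework would provide, which is the separation of concerns the abstract attributes to MapReduce.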