6,390 research outputs found
Enumerating Maximal Bicliques from a Large Graph using MapReduce
We consider the enumeration of maximal bipartite cliques (bicliques) from a
large graph, a task central to many practical data mining problems in social
network analysis and bioinformatics. We present novel parallel algorithms for
the MapReduce platform, and an experimental evaluation using Hadoop MapReduce.
Our algorithm is based on clustering the input graph into smaller sized
subgraphs, followed by processing different subgraphs in parallel. Our
algorithm uses two ideas that enable it to scale to large graphs: (1) the
redundancy in work between different subgraph explorations is minimized through
a careful pruning of the search space, and (2) the load on different reducers
is balanced through the use of an appropriate total order among the vertices.
Our evaluation shows that the algorithm scales to large graphs with millions of
edges and tens of mil- lions of maximal bicliques. To our knowledge, this is
the first work on maximal biclique enumeration for graphs of this scale.Comment: A preliminary version of the paper was accepted at the Proceedings of
the 3rd IEEE International Congress on Big Data 201
- …