291 research outputs found
Clustering and Community Detection in Directed Networks: A Survey
Networks (or graphs) appear as dominant structures in diverse domains,
including sociology, biology, neuroscience and computer science. In most of the
aforementioned cases graphs are directed - in the sense that there is
directionality on the edges, making the semantics of the edges non-symmetric.
An interesting feature that real networks present is the clustering or
community structure property, under which the graph topology is organized into
modules commonly called communities or clusters. The essence here is that nodes
of the same community are highly similar while on the contrary, nodes across
communities present low similarity. Revealing the underlying community
structure of directed complex networks has become a crucial and
interdisciplinary topic with a plethora of applications. Therefore, naturally
there is a recent wealth of research production in the area of mining directed
graphs - with clustering being the primary method and tool for community
detection and evaluation. The goal of this paper is to offer an in-depth review
of the methods presented so far for clustering directed networks along with the
relevant necessary methodological background and also related applications. The
survey commences by offering a concise review of the fundamental concepts and
methodological base on which graph clustering algorithms capitalize. Then we
present the relevant work along two orthogonal classifications. The first one
is mostly concerned with the methodological principles of the clustering
algorithms, while the second one approaches the methods from the viewpoint
regarding the properties of a good cluster in a directed network. Further, we
present methods and metrics for evaluating graph clustering results,
demonstrate interesting application domains and provide promising future
research directions. Comment: 86 pages, 17 figures. Physics Reports Journal (To Appear)
Discovering Communities of Community Discovery
Discovering communities in complex networks means grouping nodes similar to
each other, to uncover latent information about them. There are hundreds of
different algorithms to solve the community detection task, each with its own
understanding and definition of what a "community" is. Dozens of review works
attempt to order such a diverse landscape -- classifying community discovery
algorithms by the process they employ to detect communities, by their
explicitly stated definition of community, or by their performance on a
standardized task. In this paper, we classify community discovery algorithms
according to a fourth criterion: the similarity of their results. We create an
Algorithm Similarity Network (ASN), whose nodes are the community detection
approaches, connected if they return similar groupings. We then perform
community detection on this network, grouping algorithms that consistently
return the same partitions or overlapping coverage over a span of more than one
thousand synthetic and real-world networks. This paper is an attempt to create
a similarity-based classification of community detection algorithms based on
empirical data. It improves over the state of the art by comparing more than
seventy approaches, discovering that the ASN contains well-separated groups,
making it a sensible tool for practitioners, aiding their choice of algorithms
fitting their analytic needs.
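As a rough illustration of the idea described in this abstract, the sketch below builds a small similarity network over algorithms from their outputs on a single graph. The Rand index as the similarity measure, the `threshold` value, and the function names are assumptions made for this example; the abstract does not state which measure or cutoff the authors use.

```python
from itertools import combinations


def rand_index(p, q):
    """Fraction of node pairs on which two partitions agree.

    p and q map each node to a community label; a pair "agrees" when both
    partitions put the two nodes together, or both keep them apart.
    """
    pairs = list(combinations(sorted(p), 2))
    agree = sum((p[a] == p[b]) == (q[a] == q[b]) for a, b in pairs)
    return agree / len(pairs)


def algorithm_similarity_network(results, threshold=0.8):
    """Connect two algorithms when their partitions are similar enough.

    results: {algorithm_name: {node: community_label}} for one network.
    threshold: hypothetical cutoff chosen for this sketch.
    Returns {(name_a, name_b): similarity} for every retained edge.
    """
    edges = {}
    for a, b in combinations(sorted(results), 2):
        s = rand_index(results[a], results[b])
        if s >= threshold:
            edges[(a, b)] = s
    return edges


# Hypothetical outputs of three algorithms on a four-node graph.
results = {
    "A": {0: 0, 1: 0, 2: 1, 3: 1},
    "B": {0: 5, 1: 5, 2: 7, 3: 7},  # same grouping as A, different label names
    "C": {0: 0, 1: 1, 2: 0, 3: 1},
}
asn = algorithm_similarity_network(results)
```

Running a community detection method on the resulting edge list would then group algorithms that behave alike, which is the second step the abstract describes.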
Generating Groups of Products Using Graph Mining Techniques
The retail industry has evolved. Nowadays, companies around the world need a better and deeper understanding of their customers in order to enhance store layout and to generate customer groups, offers, and personalized recommendations, among others. To accomplish these objectives, it is very important to know which products are related to each other. Classical approaches for clustering products, such as K-means or SOFM, do not work when the data are scattered and large in volume. Even association rules give results that are difficult to interpret. These facts motivate us to use a novel approach that generates communities of products. One of the main advantages of these communities is that they are meaningful and easily interpretable by retail analysts. This approach allows the processing of billions of transaction records within a reasonable time, according to the needs of companies.
Large network community detection by fast label propagation
Many networks exhibit some community structure. There exists a wide variety
of approaches to detect communities in networks, each offering different
interpretations and associated algorithms. For large networks, there is the
additional requirement of speed. In this context, the so-called label
propagation algorithm (LPA) was proposed, which runs in near-linear time. In
partitions uncovered by LPA, each node is ensured to have most links to its
assigned community. We here propose a fast variant of LPA (FLPA) that is based
on processing a queue of nodes whose neighbourhood recently changed. We test
FLPA exhaustively on benchmark networks and empirical networks, finding that it
can run up to 700 times faster than LPA. In partitions found by FLPA, we prove
that each node is again guaranteed to have most links to its assigned
community. Our results show that FLPA is generally preferable to LPA.
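The queue-based idea in this abstract can be sketched as follows. This is a minimal illustration for undirected graphs, not the authors' implementation: the adjacency-dict input, the function name, and the tie-breaking rule (keep the current label when it is already among the most frequent) are assumptions made here so that the loop is easy to reason about.

```python
from collections import Counter, deque
import random


def fast_label_propagation(adj, seed=0):
    """Queue-based label propagation on an undirected graph.

    adj: {node: list of neighbours}. Returns {node: community_label}.
    """
    rng = random.Random(seed)
    labels = {v: v for v in adj}  # every node starts in its own community
    order = list(adj)
    rng.shuffle(order)
    queue = deque(order)          # nodes whose neighbourhood may have changed
    queued = set(order)

    while queue:
        v = queue.popleft()
        queued.discard(v)
        if not adj[v]:
            continue              # isolated node keeps its own label
        counts = Counter(labels[u] for u in adj[v])
        best = max(counts.values())
        if counts[labels[v]] == best:
            continue              # assumption: keep the current label on ties
        labels[v] = rng.choice([lab for lab, c in counts.items() if c == best])
        # Only neighbours outside the new community can be affected; re-queue them.
        for u in adj[v]:
            if labels[u] != labels[v] and u not in queued:
                queue.append(u)
                queued.add(u)
    return labels


# Two disjoint triangles: each one should collapse to a single label.
adj = {0: [1, 2], 1: [0, 2], 2: [0, 1], 3: [4, 5], 4: [3, 5], 5: [3, 4]}
labels = fast_label_propagation(adj, seed=1)
```

With this tie-breaking rule, every label change strictly increases the number of edges inside communities, so the queue eventually empties. At that point each node holds a label that is at least as frequent among its neighbours as any other label.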
- …