Clustering and Community Detection in Directed Networks: A Survey
Networks (or graphs) appear as dominant structures in diverse domains,
including sociology, biology, neuroscience and computer science. In most of the
aforementioned cases, graphs are directed, in the sense that the edges carry a
direction, which makes the semantics of the edges non-symmetric.
An interesting feature that real networks present is the clustering or
community structure property, under which the graph topology is organized into
modules commonly called communities or clusters. The essence here is that nodes
of the same community are highly similar, whereas nodes in different
communities present low similarity. Revealing the underlying community
structure of directed complex networks has become a crucial and
interdisciplinary topic with a plethora of applications. Naturally, there has
been a recent wealth of research on mining directed graphs, with clustering
being the primary method and tool for community
detection and evaluation. The goal of this paper is to offer an in-depth review
of the methods presented so far for clustering directed networks along with the
necessary methodological background and related applications. The
survey commences by offering a concise review of the fundamental concepts and
methodological base on which graph clustering algorithms capitalize. Then we
present the relevant work along two orthogonal classifications. The first one
is mostly concerned with the methodological principles of the clustering
algorithms, while the second one approaches the methods from the viewpoint
of the properties of a good cluster in a directed network. Further, we
present methods and metrics for evaluating graph clustering results,
demonstrate interesting application domains and provide promising future
research directions.
Comment: 86 pages, 17 figures. Physics Reports Journal (To Appear
Network as a computer: ranking paths to find flows
We explore a simple mathematical model of network computation, based on
Markov chains. Similar models apply to a broad range of computational
phenomena, arising in networks of computers, as well as in genetic and neural
nets, in social networks, and so on. The main problem in interacting with such
spontaneously evolving computational systems is that the data are not uniformly
structured. An interesting approach is to try to extract the semantic content
of the data from their distribution among the nodes. A concept is then
identified by finding the community of nodes that share it. The task of data
structuring is thus reduced to the task of finding the network communities, as
groups of nodes that together perform some non-local data processing. Towards
this goal, we extend the ranking methods from nodes to paths. This allows us to
extract some information about the likely flow biases from the available static
information about the network.
Comment: 12 pages, CSR 200
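The step from ranking nodes to ranking paths can be illustrated with a minimal sketch. This is not the paper's construction: it assumes a PageRank-style random walk with uniform teleportation (the `damping` parameter is an assumption), ranks nodes by the stationary distribution of that Markov chain, and scores a path by the probability that the stationary chain traverses it. The toy graph and all function names are hypothetical.

```python
def transition_matrix(nodes, edges, damping=0.85):
    """Row-stochastic matrix of a random walk with uniform teleportation.

    The teleportation term is an assumption made so that the chain has a
    unique stationary distribution on any directed graph.
    """
    n = len(nodes)
    index = {v: i for i, v in enumerate(nodes)}
    counts = [[0.0] * n for _ in range(n)]
    for u, v in edges:
        counts[index[u]][index[v]] += 1.0
    P = []
    for row in counts:
        total = sum(row)
        if total == 0:
            P.append([1.0 / n] * n)  # dangling node: jump uniformly
        else:
            P.append([damping * x / total + (1 - damping) / n for x in row])
    return P, index


def stationary(P, iterations=200):
    """Approximate the stationary distribution by power iteration."""
    n = len(P)
    pi = [1.0 / n] * n
    for _ in range(iterations):
        pi = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
    return pi


def path_score(path, P, pi, index):
    """Probability that the stationary chain starts at path[0] and follows path."""
    score = pi[index[path[0]]]
    for u, v in zip(path, path[1:]):
        score *= P[index[u]][index[v]]
    return score


# Hypothetical toy network.
nodes = ["a", "b", "c", "d"]
edges = [("a", "b"), ("b", "c"), ("c", "a"), ("a", "c"), ("d", "a")]

P, index = transition_matrix(nodes, edges)
pi = stationary(P)

paths = [["a", "b", "c"], ["a", "c", "a"], ["d", "a", "b"]]
ranked = sorted(paths, key=lambda p: path_score(p, P, pi, index), reverse=True)
for p in ranked:
    print(" -> ".join(p), path_score(p, P, pi, index))
```

A path of length one reduces to the ordinary node rank, so this score is a direct extension of node ranking; other path-ranking schemes are possible.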
Local Ranking Problem on the BrowseGraph
The "Local Ranking Problem" (LRP) is related to the computation of a
centrality-like rank on a local graph, where the scores of the nodes could
significantly differ from the ones computed on the global graph. Previous work
has studied LRP on the hyperlink graph but never on the BrowseGraph, namely a
graph where nodes are webpages and edges are browsing transitions. Recently,
this graph has received more and more attention in many different tasks such as
ranking, prediction and recommendation. However, a web-server has only the
browsing traffic performed on its pages (local BrowseGraph) and, as a
consequence, the local computation can lead to estimation errors, which hinders
the increasing number of applications in the state of the art. Also, although
the divergence between the local and global ranks has been measured, the
possibility of estimating such divergence using only local knowledge has been
largely overlooked. These aspects are of great interest for online service
providers who want to: (i) gauge their ability to correctly assess the
importance of their resources only based on their local knowledge, and (ii)
take into account real user browsing fluxes that better capture the actual user
interest than the static hyperlink network. We study the LRP on a
BrowseGraph from a large news provider, considering as subgraphs the
aggregations of browsing traces of users coming from different domains. We show
that the distance between rankings can be accurately predicted based only on
structural information of the local graph, achieving an average
rank correlation as high as 0.8
- …
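To make the divergence between local and global ranks concrete, the following is a minimal sketch under stated assumptions: it uses PageRank as the centrality-like rank, an unweighted toy graph, a local graph taken as the induced subgraph on a node subset, and Kendall's tau as the rank correlation. None of these choices is confirmed by the abstract, and all names and data are hypothetical.

```python
from itertools import combinations


def pagerank(nodes, edges, damping=0.85, iterations=100):
    """Plain power-iteration PageRank on an unweighted directed graph."""
    nodes = list(nodes)
    n = len(nodes)
    out_links = {v: [] for v in nodes}
    for u, v in edges:
        out_links[u].append(v)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(iterations):
        new = {v: (1 - damping) / n for v in nodes}
        for u in nodes:
            targets = out_links[u] or nodes  # dangling node: spread uniformly
            share = damping * rank[u] / len(targets)
            for v in targets:
                new[v] += share
        rank = new
    return rank


def kendall_tau(x, y):
    """Kendall tau-a between two score dicts defined over the same keys."""
    concordant = discordant = 0
    for a, b in combinations(list(x), 2):
        s = (x[a] - x[b]) * (y[a] - y[b])
        if s > 0:
            concordant += 1
        elif s < 0:
            discordant += 1
    pairs = len(x) * (len(x) - 1) / 2
    return (concordant - discordant) / pairs


# Hypothetical global graph of page-to-page transitions.
global_nodes = ["p1", "p2", "p3", "p4", "p5", "p6"]
global_edges = [
    ("p1", "p2"), ("p2", "p3"), ("p3", "p1"), ("p4", "p1"),
    ("p5", "p4"), ("p6", "p4"), ("p5", "p6"), ("p3", "p5"),
]

# Local view: only the pages and transitions visible to one observer.
local_nodes = ["p1", "p2", "p3", "p4"]
local_edges = [(u, v) for u, v in global_edges
               if u in local_nodes and v in local_nodes]

global_rank = pagerank(global_nodes, global_edges)
local_rank = pagerank(local_nodes, local_edges)

# Compare the two rankings on the nodes that both graphs share.
global_on_local = {v: global_rank[v] for v in local_nodes}
tau = kendall_tau(local_rank, global_on_local)
print("Kendall tau between local and global ranks:", tau)
```

A value of tau near 1 means the local ranking preserves the global ordering; lower values indicate the kind of estimation error the abstract describes.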