
    Generalized Markov stability of network communities

    We address the problem of community detection in networks by introducing a general definition of Markov stability, based on the difference between the probability fluxes of a Markov chain on the network at different time scales. The specific implementation of the quality function and the resulting optimal community structure thus depend both on the type of Markov process and on the specific Markov times considered. For instance, if we use a natural Markov chain dynamics and discount its stationary distribution (that is, we take as the reference process the dynamics at infinite time), we obtain the standard formulation of Markov stability. Notably, the possibility of using finite-time transition probabilities to define the reference process naturally allows detecting communities at different resolutions, without the need to consider a continuous-time Markov chain in the small-time limit. The main advantage of our general formulation of Markov stability based on dynamical flows is that we work with lumped Markov chains on network partitions, which have the same stationary distribution as the original process. In this way the form of the quality function becomes invariant under partitioning, leading to a self-consistent definition of community structures at different aggregation scales.
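
As a rough illustration of the standard formulation mentioned in this abstract (the stationary distribution used as the reference process), the sketch below evaluates a discrete-time Markov stability for a given partition of a small undirected graph. It is not the generalized definition the paper introduces; the function names, the toy graph, and the choice of a simple random walk are all assumptions made for the example.

```python
def transition_matrix(adj):
    """Row-normalize a weighted adjacency matrix into random-walk probabilities."""
    return [[w / sum(row) for w in row] for row in adj]


def mat_mul(a, b):
    """Multiply two square matrices stored as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def markov_stability(adj, communities, t):
    """Sum over communities of pi_i * (P^t)_ij - pi_i * pi_j for i, j in the same community.

    Assumes an undirected, connected graph, so the stationary distribution
    pi is proportional to node degree.
    """
    n = len(adj)
    total = sum(sum(row) for row in adj)
    pi = [sum(row) / total for row in adj]
    p = transition_matrix(adj)
    # Start from the identity matrix and apply P a total of t times.
    p_t = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(t):
        p_t = mat_mul(p_t, p)
    return sum(pi[i] * p_t[i][j] - pi[i] * pi[j]
               for c in communities for i in c for j in c)


# Hypothetical toy graph: two triangles joined by the single edge 2-3.
adj = [
    [0, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 0, 1, 0, 0],
    [0, 0, 1, 0, 1, 1],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 1, 1, 0],
]
two_triangles = [[0, 1, 2], [3, 4, 5]]
singletons = [[i] for i in range(6)]

for t in (1, 2, 4):
    print(t, markov_stability(adj, two_triangles, t),
          markov_stability(adj, singletons, t))
```

For an undirected graph the value at t = 1 coincides with Newman-Girvan modularity, which gives a convenient sanity check; varying t changes which partitions score best, which is the resolution effect the abstract refers to.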

    Personalized PageRank with Node-dependent Restart

    Personalized PageRank is an algorithm to classify the importance of web pages on a user-dependent basis. We introduce two generalizations of Personalized PageRank with node-dependent restart. The first generalization is based on the proportion of visits to nodes before the restart, whereas the second generalization is based on the probability that a node is visited just before the restart. In the original case of constant restart probability, the two measures coincide. We discuss interesting particular cases of restart probabilities and restart distributions. We show that both generalizations of Personalized PageRank have an elegant expression connecting the so-called direct and reverse Personalized PageRanks, which yields a symmetry property of these Personalized PageRanks.
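
The two measures described in this abstract can be illustrated with a small numerical sketch. The model below is one plausible reading, not the paper's exact definitions: at node i the walker follows an out-link with probability `cont[i]` and otherwise restarts from the distribution `v`. The first measure is taken to be the long-run proportion of visits to each node, and the second the distribution of the node occupied just before a restart. All names and the toy graph are hypothetical.

```python
def occupancy(P, cont, v, iters=2000):
    """Long-run proportion of visits under node-dependent restart (power iteration).

    P    : row-stochastic transition matrix (list of rows)
    cont : cont[i] is the probability of following a link from node i;
           1 - cont[i] is the restart probability at node i (assumed > 0)
    v    : restart distribution
    """
    n = len(P)
    pi = list(v)
    for _ in range(iters):
        # Probability mass that restarts in this step.
        restart_mass = sum(pi[i] * (1.0 - cont[i]) for i in range(n))
        pi = [sum(pi[i] * cont[i] * P[i][j] for i in range(n)) + restart_mass * v[j]
              for j in range(n)]
    return pi


def restart_location(P, cont, v, iters=2000):
    """Distribution of the node visited just before a restart."""
    pi = occupancy(P, cont, v, iters)
    weights = [pi[i] * (1.0 - cont[i]) for i in range(len(P))]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical 4-node directed graph; restarts always return to node 0.
P = [
    [0.0, 0.5, 0.5, 0.0],
    [0.0, 0.0, 1.0, 0.0],
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.5, 0.5],
]
v = [1.0, 0.0, 0.0, 0.0]

constant = [0.85] * 4
node_dependent = [0.9, 0.5, 0.7, 0.8]

print(occupancy(P, constant, v))
print(restart_location(P, constant, v))
print(occupancy(P, node_dependent, v))
print(restart_location(P, node_dependent, v))
```

With a constant restart probability the two printed vectors agree, matching the statement in the abstract; with node-dependent values they generally differ, because nodes with a high restart probability are over-represented among restart locations.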

    Clustering and Community Detection in Directed Networks: A Survey

    Networks (or graphs) appear as dominant structures in diverse domains, including sociology, biology, neuroscience and computer science. In most of the aforementioned cases graphs are directed, in the sense that there is directionality on the edges, making the semantics of the edges non-symmetric. An interesting feature that real networks present is the clustering or community structure property, under which the graph topology is organized into modules commonly called communities or clusters. The essence here is that nodes of the same community are highly similar, while nodes across communities present low similarity. Revealing the underlying community structure of directed complex networks has become a crucial and interdisciplinary topic with a plethora of applications. Consequently, there has been a recent wealth of research in the area of mining directed graphs, with clustering being the primary method and tool for community detection and evaluation. The goal of this paper is to offer an in-depth review of the methods presented so far for clustering directed networks, along with the necessary methodological background and related applications. The survey commences by offering a concise review of the fundamental concepts and methodological base on which graph clustering algorithms capitalize. Then we present the relevant work along two orthogonal classifications. The first one is mostly concerned with the methodological principles of the clustering algorithms, while the second one approaches the methods from the viewpoint of the properties of a good cluster in a directed network. Further, we present methods and metrics for evaluating graph clustering results, demonstrate interesting application domains and provide promising future research directions. Comment: 86 pages, 17 figures. Physics Reports Journal (To Appear)

    Dynamics-based centrality for general directed networks

    Determining the relative importance of nodes in directed networks is important in, for example, ranking websites, publications, and sports teams, and for understanding signal flows in systems biology. A prevailing centrality measure in this respect is PageRank. In this work, we focus on another class of centrality derived from the Laplacian of the network. We extend the Laplacian-based centrality, which has mainly been applied to strongly connected networks, to the case of general directed networks so that we can quantitatively compare arbitrary nodes. Toward this end, we adopt the idea used in PageRank to introduce global connectivity between all pairs of nodes with a certain strength. Numerical simulations are carried out on some networks. We also offer interpretations of the Laplacian-based centrality for general directed networks in terms of various dynamical and structural properties of networks. Importantly, the Laplacian-based centrality defined as the stationary density of the continuous-time random walk with random jumps is shown to be equivalent to the absorption probability of the random walk with sinks at each node but without random jumps. Similarly, the proposed centrality represents the importance of nodes in dynamics on the original network supplied with sinks but not with random jumps. Comment: 7 figures
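
The stationary density of a continuous-time random walk with random jumps, as mentioned in this abstract, can be approximated numerically. The sketch below is only an illustration under assumed conventions: the walker leaves node i for node j at rate `adj[i][j]`, a uniform jump rate `eps` is added between every ordered pair of distinct nodes, and the stationary density is found by uniformization followed by power iteration. The paper's own Laplacian convention and normalization may differ; the function name, `eps`, and the toy network are hypothetical.

```python
def stationary_density(adj, eps=0.1, iters=5000):
    """Stationary density of a continuous-time walk with uniform random jumps.

    adj[i][j] is the rate of moving from node i to node j along a link;
    eps is an assumed extra jump rate between every ordered pair of
    distinct nodes, which makes the process irreducible.
    """
    n = len(adj)
    rate = [[(adj[i][j] + eps) if i != j else 0.0 for j in range(n)]
            for i in range(n)]
    out = [sum(row) for row in rate]
    # Uniformization constant, strictly larger than every exit rate so that
    # the embedded discrete-time chain keeps a positive self-loop probability.
    lam = 1.1 * max(out)
    p = [1.0 / n] * n
    for _ in range(iters):
        p = [p[j] * (1.0 - out[j] / lam)
             + sum(p[i] * rate[i][j] for i in range(n)) / lam
             for j in range(n)]
    return p


# Hypothetical directed network that is not strongly connected:
# node 3 has no out-links, so without jumps the walk would be absorbed there.
adj = [
    [0, 1, 1, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 1],
    [0, 0, 0, 0],
]

density = stationary_density(adj)
print(density)
```

Because of the added jumps every node receives a strictly positive value, so arbitrary nodes can be compared even though the original network has a sink.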

    Ranking algorithms on directed configuration networks

    This paper studies the distribution of a family of rankings, which includes Google's PageRank, on a directed configuration model. In particular, it is shown that the distribution of the rank of a randomly chosen node in the graph converges in distribution to a finite random variable $\mathcal{R}^*$ that can be written as a linear combination of i.i.d. copies of the endogenous solution to a stochastic fixed point equation of the form $\mathcal{R} \stackrel{\mathcal{D}}{=} \sum_{i=1}^{\mathcal{N}} \mathcal{C}_i \mathcal{R}_i + \mathcal{Q}$, where $(\mathcal{Q}, \mathcal{N}, \{\mathcal{C}_i\})$ is a real-valued vector with $\mathcal{N} \in \{0,1,2,\dots\}$, $P(|\mathcal{Q}| > 0) > 0$, and the $\{\mathcal{R}_i\}$ are i.i.d. copies of $\mathcal{R}$, independent of $(\mathcal{Q}, \mathcal{N}, \{\mathcal{C}_i\})$. Moreover, we provide precise asymptotics for the limit $\mathcal{R}^*$, which, when the in-degree distribution in the directed configuration model has a power law, imply a power law distribution for $\mathcal{R}^*$ with the same exponent.