190 research outputs found

    Z-score-based modularity for community detection in networks

    Identifying community structure in networks is an issue of particular interest in network science. The modularity introduced by Newman and Girvan [Phys. Rev. E 69, 026113 (2004)] is the most popular quality function for community detection in networks. In this study, we identify a problem in the concept of modularity and suggest a solution to overcome this problem. Specifically, we obtain a new quality function for community detection. We refer to the function as Z-modularity because it measures the Z-score of a given division with respect to the fraction of the number of edges within communities. Our theoretical analysis shows that Z-modularity mitigates the resolution limit of the original modularity in certain cases. Computational experiments using both artificial networks and well-known real-world networks demonstrate the validity and reliability of the proposed quality function. Comment: 8 pages, 10 figures
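
    As a rough illustration of the idea of scoring a division by a Z-score, the sketch below compares the observed fraction of within-community edges with the fraction expected under a degree-based null model. The binomial normalization, the function name `z_score_of_division`, and the input format (an edge list plus a vertex-to-community map) are assumptions made for this example; the paper's exact definition of Z-modularity may differ, for instance by a constant factor.

```python
import math
from collections import defaultdict


def z_score_of_division(edges, community_of):
    """Z-score of the within-community edge fraction (illustrative sketch).

    edges: list of (u, v) pairs of an unweighted, undirected graph.
    community_of: dict mapping each vertex to a community label.
    Assumption: each edge is treated as an independent trial that falls
    inside some community with probability p = sum_C (D_C / 2m)^2.
    """
    m = len(edges)
    if m == 0:
        raise ValueError("graph has no edges")

    internal = 0                    # number of edges with both ends in one community
    degree_sum = defaultdict(int)   # D_C: total degree of each community
    for u, v in edges:
        cu, cv = community_of[u], community_of[v]
        degree_sum[cu] += 1
        degree_sum[cv] += 1
        if cu == cv:
            internal += 1

    observed = internal / m
    p = sum((d / (2 * m)) ** 2 for d in degree_sum.values())
    if p <= 0.0 or p >= 1.0:
        # Degenerate null model (e.g. a single community): no spread to normalize by.
        return 0.0
    return (observed - p) / math.sqrt(p * (1 - p) / m)


# Two triangles joined by one bridge edge, split into the two triangles.
example_edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
example_division = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
example_z = z_score_of_division(example_edges, example_division)
```

    Placing every vertex in one community gives an expected fraction of 1, so this sketch returns 0 for that trivial division instead of dividing by zero.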

    Robust Densest Subgraph Discovery

    Dense subgraph discovery is an important primitive in graph mining, which has a wide variety of applications in diverse domains. In the densest subgraph problem, given an undirected graph $G=(V,E)$ with an edge-weight vector $w=(w_e)_{e\in E}$, we aim to find $S\subseteq V$ that maximizes the density, i.e., $w(S)/|S|$, where $w(S)$ is the sum of the weights of the edges in the subgraph induced by $S$. Although the densest subgraph problem is one of the most well-studied optimization problems for dense subgraph discovery, there is an implicit strong assumption; it is assumed that the weights of all the edges are known exactly as input. In real-world applications, there are often cases where we have only uncertain information of the edge weights. In this study, we provide a framework for dense subgraph discovery under the uncertainty of edge weights. Specifically, we address such an uncertainty issue using the theory of robust optimization. First, we formulate our fundamental problem, the robust densest subgraph problem, and present a simple algorithm. We then formulate the robust densest subgraph problem with sampling oracle that models dense subgraph discovery using an edge-weight sampling oracle, and present an algorithm with a strong theoretical performance guarantee. Computational experiments using both synthetic graphs and popular real-world graphs demonstrate the effectiveness of our proposed algorithms. Comment: 10 pages; Accepted to ICDM 201
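
    To make the density objective w(S)/|S| concrete, here is a minimal sketch for the classical setting in which all edge weights are known exactly. It uses the standard greedy peeling heuristic (repeatedly remove a vertex of minimum weighted degree and keep the best intermediate subset), which is a well-known 1/2-approximation for the densest subgraph problem. It is not the robust algorithm proposed in the paper, and the names `density` and `greedy_peeling` and the edge-dictionary input format are assumptions for illustration.

```python
from collections import defaultdict


def density(edges, subset):
    """Return w(S)/|S| for S = subset; edges maps (u, v) -> weight."""
    subset = set(subset)
    if not subset:
        return 0.0
    total = sum(w for (u, v), w in edges.items() if u in subset and v in subset)
    return total / len(subset)


def greedy_peeling(edges):
    """Greedy peeling for the densest subgraph problem (known weights).

    Assumes an undirected graph without self-loops, given as a dict
    mapping (u, v) pairs to nonnegative weights.
    """
    if not edges:
        return set(), 0.0

    adj = defaultdict(dict)
    for (u, v), w in edges.items():
        adj[u][v] = w
        adj[v][u] = w

    remaining = set(adj)
    # Weighted degree of each vertex within the remaining subgraph.
    degree = {u: sum(adj[u].values()) for u in adj}
    total = sum(edges.values())

    best_set, best_density = set(remaining), total / len(remaining)
    while len(remaining) > 1:
        u = min(remaining, key=lambda x: degree[x])
        remaining.remove(u)
        total -= degree[u]              # drop the edges incident to u
        for v, w in adj[u].items():
            if v in remaining:
                degree[v] -= w
        current = total / len(remaining)
        if current > best_density:
            best_set, best_density = set(remaining), current
    return best_set, best_density


# A 4-clique on {0, 1, 2, 3} with a pendant vertex 4 attached to vertex 3.
example_edges = {(0, 1): 1, (0, 2): 1, (0, 3): 1, (1, 2): 1, (1, 3): 1, (2, 3): 1, (3, 4): 1}
example_set, example_density = greedy_peeling(example_edges)
```

    On this small example the whole graph has density 7/5, and peeling off the pendant vertex leaves the 4-clique with density 6/4.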

    Additive Approximation Algorithms for Modularity Maximization

    The modularity is a quality function in community detection, which was introduced by Newman and Girvan (2004). Community detection in graphs is now often conducted through modularity maximization: given an undirected graph $G=(V,E)$, we are asked to find a partition $\mathcal{C}$ of $V$ that maximizes the modularity. Although numerous algorithms have been developed to date, most of them have no theoretical approximation guarantee. Recently, to overcome this issue, the design of modularity maximization algorithms with provable approximation guarantees has attracted significant attention in the computer science community. In this study, we further investigate the approximability of modularity maximization. More specifically, we propose a polynomial-time $\left(\cos\left(\frac{3-\sqrt{5}}{4}\pi\right) - \frac{1+\sqrt{5}}{8}\right)$-additive approximation algorithm for the modularity maximization problem. Note here that $\cos\left(\frac{3-\sqrt{5}}{4}\pi\right) - \frac{1+\sqrt{5}}{8} < 0.42084$ holds. This improves the current best additive approximation error of $0.4672$, which was recently provided by Dinh, Li, and Thai (2015). Interestingly, our analysis also demonstrates that the proposed algorithm obtains a nearly-optimal solution for any instance with a very high modularity value. Moreover, we propose a polynomial-time $0.16598$-additive approximation algorithm for the maximum modularity cut problem. It should be noted that this is the first non-trivial approximability result for the problem. Finally, we demonstrate that our approximation algorithm can be extended to some related problems. Comment: 23 pages, 4 figures
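
    For readers who want to check the numbers, the following sketch evaluates the standard Newman-Girvan modularity of a given partition and computes the additive error constant quoted in the abstract. It only evaluates the objective and the constant; it does not implement the approximation algorithm itself. The function name `modularity`, the constant name `ADDITIVE_ERROR`, and the edge-list input format are assumptions for illustration.

```python
import math
from collections import defaultdict


def modularity(edges, community_of):
    """Newman-Girvan modularity of a partition of an unweighted graph.

    edges: list of (u, v) pairs; community_of: dict vertex -> community label.
    Q = sum over communities C of  m_C / m - (D_C / (2m))^2,
    where m_C counts edges inside C and D_C is the total degree of C.
    """
    m = len(edges)
    if m == 0:
        raise ValueError("graph has no edges")

    internal = defaultdict(int)
    degree_sum = defaultdict(int)
    for u, v in edges:
        cu, cv = community_of[u], community_of[v]
        degree_sum[cu] += 1
        degree_sum[cv] += 1
        if cu == cv:
            internal[cu] += 1
    return sum(internal[c] / m - (degree_sum[c] / (2 * m)) ** 2 for c in degree_sum)


# Additive error stated in the abstract: cos((3 - sqrt(5)) / 4 * pi) - (1 + sqrt(5)) / 8.
ADDITIVE_ERROR = math.cos((3 - math.sqrt(5)) / 4 * math.pi) - (1 + math.sqrt(5)) / 8

# Two triangles joined by one bridge edge, split into the two triangles.
example_edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
example_partition = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
example_q = modularity(example_edges, example_partition)
```

    In the example, each triangle contains 3 of the 7 edges and has total degree 7, so the modularity is 2 * (3/7 - 1/4) = 5/14.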

    The Densest Subgraph Problem with a Convex/Concave Size Function

    we propose a linear-programming-based polynomial-time exact algorithm. It should be emphasized that this algorithm obtains not only an optimal solution to the problem but also subsets of vertices corresponding to the extreme points of the upper convex hull of $\{(|S|, w(S)) \mid S \subseteq V\}$, which we refer to as the dense frontier points. We also propose a flow-based combinatorial exact algorithm for unweighted graphs that runs in $O(n^3)$ time. Finally, we propose a nearly-linear-time 3-approximation algorithm

    Stochastic Solutions for Dense Subgraph Discovery in Multilayer Networks

    Network analysis has played a key role in knowledge discovery and data mining. In many real-world applications in recent years, we are interested in mining multilayer networks, where we have a number of edge sets called layers, which encode different types of connections and/or time-dependent connections over the same set of vertices. Among many network analysis techniques, dense subgraph discovery, aiming to find a dense component in a network, is an essential primitive with a variety of applications in diverse domains. In this paper, we introduce a novel optimization model for dense subgraph discovery in multilayer networks. Our model aims to find a stochastic solution, i.e., a probability distribution over the family of vertex subsets, rather than a single vertex subset, whereas it can also be used for obtaining a single vertex subset. For our model, we design an LP-based polynomial-time exact algorithm. Moreover, to handle large-scale networks, we also devise a simple, scalable preprocessing algorithm, which often reduces the size of the input networks significantly and results in a substantial speed-up. Computational experiments demonstrate the validity of our model and the effectiveness of our algorithms. Comment: Accepted to WSDM 202


    A method for dividing a two-dimensional multi-connected region of a complex shape into a set of triangular elements is developed. A region of a complex shape in the physical plane is divided into some simply connected subregions, and each subregion is mapped onto a square region in the transformed plane. The inverse functions of the mapping are determined by the solution to elliptic partial differential equations with the Dirichlet boundary conditions. After the square region is divided into a set of finite elements, each element is inversely mapped onto the subregions by use of the functions. The finite element data for the global region are made of those for the divided subregions