Z-score-based modularity for community detection in networks
Identifying community structure in networks is an issue of particular
interest in network science. The modularity introduced by Newman and Girvan
[Phys. Rev. E 69, 026113 (2004)] is the most popular quality function for
community detection in networks. In this study, we identify a problem in the
concept of modularity and suggest a solution to overcome this problem.
Specifically, we obtain a new quality function for community detection. We
refer to the function as Z-modularity because it measures the Z-score of a
given division with respect to the fraction of the number of edges within
communities. Our theoretical analysis shows that Z-modularity mitigates the
resolution limit of the original modularity in certain cases. Computational
experiments using both artificial networks and well-known real-world networks
demonstrate the validity and reliability of the proposed quality function.
Comment: 8 pages, 10 figures
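As a rough illustration of the idea above, the following sketch computes the Newman-Girvan modularity of a division together with a Z-score of the fraction of within-community edges. The null model is an assumption made for this example: each edge is taken to fall inside a community independently with probability p, the sum over communities of (D_C / 2m)^2, and the score is normalized by the per-edge standard deviation sqrt(p(1 - p)). The function name `modularity_and_z` is hypothetical, and the paper's exact definition may differ.

```python
from math import sqrt

def modularity_and_z(edges, community):
    """edges: list of undirected (u, v) pairs; community: dict node -> label."""
    m = len(edges)
    inside = {}   # number of edges with both endpoints in community C
    degree = {}   # sum of vertex degrees in community C
    for u, v in edges:
        cu, cv = community[u], community[v]
        degree[cu] = degree.get(cu, 0) + 1
        degree[cv] = degree.get(cv, 0) + 1
        if cu == cv:
            inside[cu] = inside.get(cu, 0) + 1
    frac_inside = sum(inside.values()) / m                    # observed fraction
    p = sum((d / (2 * m)) ** 2 for d in degree.values())      # expected fraction
    q = frac_inside - p                                       # modularity
    var = p * (1 - p)                                         # assumed null variance
    z = q / sqrt(var) if var > 0 else float("nan")
    return q, z

# Two triangles joined by a single edge, split into the two triangles.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
community = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
q, z = modularity_and_z(edges, community)
```

Placing every vertex in one community gives p = 1, so the assumed variance vanishes and the sketch returns NaN for the Z-score rather than dividing by zero.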
Robust Densest Subgraph Discovery
Dense subgraph discovery is an important primitive in graph mining, which has
a wide variety of applications in diverse domains. In the densest subgraph
problem, given an undirected graph $G = (V, E)$ with an edge-weight vector
$w = (w_e)_{e \in E}$, we aim to find $S \subseteq V$ that maximizes the density,
i.e., $w(S)/|S|$, where $w(S)$ is the sum of the weights of the edges in the
subgraph induced by $S$. Although the densest subgraph problem is one of the
most well-studied optimization problems for dense subgraph discovery, there is
an implicit strong assumption; it is assumed that the weights of all the edges
are known exactly as input. In real-world applications, there are often cases
where we have only uncertain information about the edge weights. In this study, we
provide a framework for dense subgraph discovery under the uncertainty of edge
weights. Specifically, we address such an uncertainty issue using the theory of
robust optimization. First, we formulate our fundamental problem, the robust
densest subgraph problem, and present a simple algorithm. We then formulate the
robust densest subgraph problem with sampling oracle that models dense subgraph
discovery using an edge-weight sampling oracle, and present an algorithm with a
strong theoretical performance guarantee. Computational experiments using both
synthetic graphs and popular real-world graphs demonstrate the effectiveness of
our proposed algorithms.
Comment: 10 pages; Accepted to ICDM 201
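To make the density objective concrete, here is a minimal sketch that evaluates the sum of induced edge weights divided by the number of vertices, and searches for a dense subset with the classical greedy peeling heuristic (repeatedly remove a vertex of minimum weighted degree). Peeling is a well-known baseline for the ordinary densest subgraph problem with exactly known weights. It is used here only as a stand-in and is not the robust algorithm of the paper. The names `density` and `greedy_peeling` are hypothetical.

```python
def density(weights, subset):
    """weights: dict {(u, v): w}; returns w(S)/|S| for the vertex subset S."""
    s = set(subset)
    if not s:
        return 0.0
    total = sum(w for (u, v), w in weights.items() if u in s and v in s)
    return total / len(s)

def greedy_peeling(weights):
    """Remove a minimum-weighted-degree vertex each round; keep the densest subset seen."""
    current = {u for edge in weights for u in edge}
    best, best_density = set(current), density(weights, current)
    while len(current) > 1:
        # Weighted degree of each remaining vertex within the current subgraph.
        deg = {u: 0.0 for u in current}
        for (u, v), w in weights.items():
            if u in current and v in current:
                deg[u] += w
                deg[v] += w
        current.remove(min(deg, key=deg.get))
        d = density(weights, current)
        if d > best_density:
            best, best_density = set(current), d
    return best, best_density

# A 4-clique on {0, 1, 2, 3} with unit weights, plus a pendant vertex 4.
weights = {(0, 1): 1.0, (0, 2): 1.0, (0, 3): 1.0,
           (1, 2): 1.0, (1, 3): 1.0, (2, 3): 1.0, (3, 4): 1.0}
best, best_density = greedy_peeling(weights)
```

Under uncertain weights, one simple way to reuse such a routine is to run it on a pessimistic weight vector (for example, lower bounds on each edge weight). That is an illustrative assumption, not a description of the framework in the abstract.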
Additive Approximation Algorithms for Modularity Maximization
The modularity is a quality function in community detection, which was
introduced by Newman and Girvan (2004). Community detection in graphs is now
often conducted through modularity maximization: given an undirected graph
$G = (V, E)$, we are asked to find a partition $\mathcal{C}$ of $V$ that maximizes
the modularity. Although numerous algorithms have been developed to date, most
of them have no theoretical approximation guarantee. Recently, to overcome this
issue, the design of modularity maximization algorithms with provable
approximation guarantees has attracted significant attention in the computer
science community.
In this study, we further investigate the approximability of modularity
maximization. More specifically, we propose a polynomial-time
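$\left(\cos\left(\frac{3-\sqrt{5}}{4}\pi\right) - \frac{1+\sqrt{5}}{8}\right)$-additive approximation algorithm for the
modularity maximization problem. Note here that
$\cos\left(\frac{3-\sqrt{5}}{4}\pi\right) - \frac{1+\sqrt{5}}{8} < 0.42084$
holds. This improves the current best additive approximation error of $0.4672$,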
which was recently provided by Dinh, Li, and Thai (2015). Interestingly, our
analysis also demonstrates that the proposed algorithm obtains a nearly-optimal
solution for any instance with a very high modularity value. Moreover, we
propose a polynomial-time $0.16598$-additive approximation algorithm for the
maximum modularity cut problem. It should be noted that this is the first
non-trivial approximability result for the problem. Finally, we demonstrate
that our approximation algorithm can be extended to some related problems.
Comment: 23 pages, 4 figures
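The notion of additive approximation error can be illustrated on a toy instance: the sketch below finds the maximum modularity by brute force over all partitions of a six-vertex graph and compares it with the modularity of a candidate partition. Exhaustive enumeration is feasible only for tiny graphs and is not the approximation algorithm described in the abstract. The helper names `modularity` and `partitions` are hypothetical.

```python
def modularity(edges, labels):
    """Newman-Girvan modularity; labels[v] is the community of vertex v."""
    m = len(edges)
    inside, degree = {}, {}
    for u, v in edges:
        cu, cv = labels[u], labels[v]
        degree[cu] = degree.get(cu, 0) + 1
        degree[cv] = degree.get(cv, 0) + 1
        if cu == cv:
            inside[cu] = inside.get(cu, 0) + 1
    return sum(inside.values()) / m - sum((d / (2 * m)) ** 2 for d in degree.values())

def partitions(n):
    """Yield every partition of n items as a label list (restricted growth string)."""
    def rec(prefix, used):
        if len(prefix) == n:
            yield list(prefix)
            return
        for c in range(used + 1):      # reuse an existing block or open a new one
            prefix.append(c)
            yield from rec(prefix, max(used, c + 1))
            prefix.pop()
    yield from rec([], 0)

# Two triangles joined by a single edge.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
all_partitions = list(partitions(6))
optimum = max(modularity(edges, lab) for lab in all_partitions)

# Additive error of a candidate solution: optimum minus achieved modularity.
candidate = [0, 0, 0, 0, 0, 0]         # everything in one community, modularity 0
additive_error = optimum - modularity(edges, candidate)
```

An algorithm with additive error at most some constant c guarantees that this gap never exceeds c on any instance, which is a meaningful guarantee because modularity is bounded above by 1.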
The Densest Subgraph Problem with a Convex/Concave Size Function
we propose a linear-programming-based polynomial-time exact algorithm. It should be emphasized that this algorithm obtains not only an optimal solution to the problem but also subsets of vertices corresponding to the extreme points of the upper convex hull of $\{(|S|, w(S)) \mid S \subseteq V\}$, which we refer to as the dense frontier points. We also propose a flow-based combinatorial exact algorithm for unweighted graphs that runs in $O(n^3)$ time. Finally, we propose a nearly-linear-time 3-approximation algorithm
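The upper convex hull mentioned above can be visualized on a tiny graph. This sketch enumerates all vertex subsets by brute force, records the maximum induced weight for each subset size, and keeps the extreme points of the upper hull of those points. It is an exponential-time illustration only, not the LP-based or flow-based algorithms of the abstract. The function names are hypothetical, and whether the trivial point (0, 0) counts as a dense frontier point in the paper is not stated in this excerpt.

```python
from itertools import combinations

def cross(o, a, b):
    """Cross product of vectors o->a and o->b (positive means a left turn)."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def upper_hull_points(weights, nodes):
    """Extreme points of the upper convex hull of {(|S|, w(S))}, by brute force."""
    nodes = list(nodes)
    points = []
    for k in range(len(nodes) + 1):
        best = max(
            sum(w for (u, v), w in weights.items() if u in s and v in s)
            for s in map(set, combinations(nodes, k))
        )
        points.append((k, best))       # only the best subset of each size matters
    hull = []
    for p in points:                   # points are already sorted by size k
        # Drop the last hull point while it lies on or below the new chord.
        while len(hull) >= 2 and cross(hull[-2], hull[-1], p) >= 0:
            hull.pop()
        hull.append(p)
    return hull

# A 4-clique on {0, 1, 2, 3} with unit weights, plus a pendant vertex 4.
weights = {(0, 1): 1, (0, 2): 1, (0, 3): 1,
           (1, 2): 1, (1, 3): 1, (2, 3): 1, (3, 4): 1}
frontier = upper_hull_points(weights, range(5))
```

In this example the hull keeps the sizes 0, 4, and 5: the 4-clique dominates all smaller subsets, so sizes 1 to 3 lie strictly below the chord from the origin to (4, 6).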
Stochastic Solutions for Dense Subgraph Discovery in Multilayer Networks
Network analysis has played a key role in knowledge discovery and data
mining. In many real-world applications in recent years, we are interested in
mining multilayer networks, where we have a number of edge sets called layers,
which encode different types of connections and/or time-dependent connections
over the same set of vertices. Among many network analysis techniques, dense
subgraph discovery, aiming to find a dense component in a network, is an
essential primitive with a variety of applications in diverse domains. In this
paper, we introduce a novel optimization model for dense subgraph discovery in
multilayer networks. Our model aims to find a stochastic solution, i.e., a
probability distribution over the family of vertex subsets, rather than a
single vertex subset, whereas it can also be used for obtaining a single vertex
subset. For our model, we design an LP-based polynomial-time exact algorithm.
Moreover, to handle large-scale networks, we also devise a simple, scalable
preprocessing algorithm, which often reduces the size of the input networks
significantly and results in a substantial speed-up. Computational experiments
demonstrate the validity of our model and the effectiveness of our algorithms.
Comment: Accepted to WSDM 202
AUTOMATIC NUMERICAL ELEMENT GENERATION BY BOUNDARY-FITTED CURVILINEAR COORDINATE SYSTEM
A method for dividing a two-dimensional multiply connected region of complex shape into
a set of triangular elements is developed. A region of complex shape in the physical
plane is divided into simply connected subregions, and each subregion is mapped
onto a square region in the transformed plane. The inverse functions of the mapping
are determined by solving elliptic partial differential equations with Dirichlet
boundary conditions. After the square region is divided into a set of finite elements,
each element is mapped back onto the subregion using these inverse functions. The finite
element data for the global region are assembled from those for the individual subregions.
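The mapping step can be sketched in a heavily simplified form. The example below assumes that the physical coordinates x and y each satisfy Laplace's equation on a uniform square grid in the transformed plane, with Dirichlet values given by the subregion's boundary points, and solves it by Gauss-Seidel iteration. Each grid cell is then split into two triangles. The abstract does not specify the paper's elliptic equations, so this linear Laplace system is a stand-in assumption, and the names `laplace_grid` and `triangulate` are hypothetical.

```python
def laplace_grid(bx, by, iterations=500):
    """bx, by: n x n lists whose boundary entries hold physical coordinates.

    Interior entries are overwritten by Gauss-Seidel sweeps for Laplace's equation.
    """
    n = len(bx)
    x = [row[:] for row in bx]
    y = [row[:] for row in by]
    for _ in range(iterations):
        for i in range(1, n - 1):
            for j in range(1, n - 1):
                x[i][j] = 0.25 * (x[i - 1][j] + x[i + 1][j] + x[i][j - 1] + x[i][j + 1])
                y[i][j] = 0.25 * (y[i - 1][j] + y[i + 1][j] + y[i][j - 1] + y[i][j + 1])
    return x, y

def triangulate(n):
    """Split each cell of an n x n node grid into two triangles of node indices."""
    triangles = []
    for i in range(n - 1):
        for j in range(n - 1):
            a = i * n + j              # lower-left node of the cell
            b, c, d = a + 1, a + n, a + n + 1
            triangles.append((a, b, d))
            triangles.append((a, d, c))
    return triangles

# Sanity case: the physical subregion is itself the unit square.
n = 5
bx = [[j / (n - 1) if i in (0, n - 1) or j in (0, n - 1) else 0.0
       for j in range(n)] for i in range(n)]
by = [[i / (n - 1) if i in (0, n - 1) or j in (0, n - 1) else 0.0
       for j in range(n)] for i in range(n)]
x, y = laplace_grid(bx, by)
triangles = triangulate(n)
```

For a curved subregion, only the boundary entries of `bx` and `by` change. The interior nodes then follow the boundary shape smoothly, and the triangle connectivity from `triangulate` is reused unchanged.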