I/O efficient Core Graph Decomposition at web scale.
Core decomposition is a fundamental graph problem with a large number of
applications. Most existing approaches for core decomposition assume that the
graph is kept in the main memory of a single machine. Nevertheless, many real-world graphs are
big and may not reside in memory. In the literature, there is only one work for
I/O efficient core decomposition that avoids loading the whole graph in memory.
However, this approach is not scalable to handle big graphs because it cannot
bound the memory size and may load most parts of the graph in memory. In
addition, this approach can hardly handle graph updates. In this paper, we
study I/O efficient core decomposition following a semi-external model, which
only allows node information to be loaded in memory. This model works well in
many web-scale graphs. We propose a semi-external algorithm and two optimized
algorithms for I/O efficient core decomposition using very simple structures
and a simple data access model. To handle dynamic graph updates, we show that our
algorithm can be naturally extended to handle edge deletion. We also propose an
I/O efficient core maintenance algorithm to handle edge insertion, and an
improved algorithm to further reduce I/O and CPU cost by investigating some new
graph properties. We conduct extensive experiments on 12 real large graphs. Our
optimal algorithm significantly outperforms the existing I/O efficient algorithm
in terms of both processing time and memory consumption. In many
memory-resident graphs, our algorithms for both core decomposition and
maintenance can even outperform the in-memory algorithm due to the simple
structures and data access model used. Our algorithms are very scalable to
handle web-scale graphs. As an example, we are the first to handle a web graph
with 978.5 million nodes and 42.6 billion edges using less than 4.2 GB memory.
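The semi-external model described above can be illustrated with a minimal Python sketch. It is not the paper's algorithm: it assumes the well-known locality property of core numbers (a node's core number is the largest k such that at least k of its neighbors have core number at least k) and repeatedly scans the adjacency lists while keeping only one integer per node in memory. The names `semi_external_cores`, `local_core`, and `scan_adjacency` are hypothetical, and the in-memory dictionary `disk_adj` only stands in for a file on disk.

```python
def local_core(estimates, neighbors, upper):
    """Largest k <= upper such that at least k neighbors have estimate >= k."""
    counts = [0] * (upper + 1)
    for u in neighbors:
        counts[min(estimates[u], upper)] += 1
    seen = 0
    for k in range(upper, 0, -1):
        seen += counts[k]
        if seen >= k:
            return k
    return 0


def semi_external_cores(scan_adjacency, degrees):
    """Refine per-node upper bounds by sequential scans until nothing changes.

    scan_adjacency: hypothetical callable yielding (node, neighbor_list) pairs,
                    standing in for one sequential pass over the edge file.
    degrees:        dict node -> degree; the only per-node state kept in memory.
    """
    core = dict(degrees)  # the degree is an upper bound on the core number
    changed = True
    while changed:
        changed = False
        for v, nbrs in scan_adjacency():
            new = local_core(core, nbrs, core[v])
            if new != core[v]:
                core[v] = new
                changed = True
    return core


# Toy graph: a triangle 0-1-2 with a pendant node 3 attached to node 2.
disk_adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}
cores = semi_external_cores(lambda: disk_adj.items(),
                            {v: len(ns) for v, ns in disk_adj.items()})
print(cores)
```

Each pass reads the edges sequentially and never stores them, so memory use is proportional to the number of nodes, which is the constraint the semi-external model imposes.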
A Fast Order-Based Approach for Core Maintenance
Graphs have been widely used in many applications such as social networks,
collaboration networks, and biological networks. One important graph analytics
task is to explore cohesive subgraphs in a large graph. Among several cohesive
subgraphs studied, k-core is one that can be computed in linear time for a
static graph. Since graphs evolve in real applications, in this paper we
study core maintenance, which aims to reduce the computational cost of recomputing
k-cores when a graph is updated dynamically. We
identify drawbacks of the existing efficient algorithm, which needs a large
search space to find the vertices that need to be updated, and has high
overhead to maintain the index built, when a graph is updated. We propose a new
order-based approach to maintain an order, called k-order, among vertices,
while a graph is updated. Our new algorithm can significantly outperform the
state-of-the-art algorithm by up to 3 orders of magnitude on the 11 large real
graphs tested. We report our findings in this paper.
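As a rough illustration of static core decomposition, the sketch below peels vertices in order of increasing remaining degree and records the removal order. It rests on two assumptions that the abstract does not confirm. First, it uses a binary heap with lazy deletion, which costs O(m log n) rather than the linear time of the bucket-based method. Second, it treats the recorded peeling order as one plausible reading of an order among vertices. The name `core_numbers_with_order` is hypothetical.

```python
import heapq


def core_numbers_with_order(adj):
    """Return (core number per vertex, order in which vertices were peeled).

    adj: dict vertex -> list of neighbors (undirected, no self-loops).
    """
    deg = {v: len(ns) for v, ns in adj.items()}
    heap = [(d, v) for v, d in deg.items()]
    heapq.heapify(heap)
    removed = set()
    core, order = {}, []
    k = 0
    while heap:
        d, v = heapq.heappop(heap)
        if v in removed or d != deg[v]:
            continue  # stale heap entry left behind by an earlier degree update
        k = max(k, d)  # core numbers never decrease along the peeling order
        core[v] = k
        order.append(v)
        removed.add(v)
        for u in adj[v]:
            if u not in removed:
                deg[u] -= 1
                heapq.heappush(heap, (deg[u], u))
    return core, order


# Toy graph: a triangle 0-1-2 with a pendant vertex 3 attached to vertex 2.
adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}
core, order = core_numbers_with_order(adj)
print(core, order)
```

Recomputing this from scratch after every edge update is the cost that core maintenance tries to avoid.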
K-Connected Cores Computation in Large Dual Networks
© 2018, The Author(s). Computing k-cores is a fundamental and important graph problem, which can be applied in many areas, such as community detection, network visualization, and network topology analysis. Due to the complex relationships between different entities, dual graphs arise widely in applications. A dual graph contains a physical graph and a conceptual graph, both of which have the same vertex set. Given that there exist no previous studies on the k-core in dual graphs, we formulate a k-connected core (k-CCO) model in dual graphs. A k-CCO is a k-core in the conceptual graph that is also connected in the physical graph. Given a dual graph and an integer k, we propose a polynomial-time algorithm for computing all k-CCOs. We also propose three algorithms for computing all maximum-connected cores (MCCO), which are the existing k-CCOs such that a (k+1)-CCO does not exist. We further study a subgraph search problem, which is computing a k-CCO that contains a set of query vertices. We propose an index-based approach to efficiently answer the query for any given parameter k. We conduct extensive experiments on six real-world datasets and four synthetic datasets. The experimental results demonstrate the effectiveness and efficiency of our proposed algorithms.
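The k-CCO definition above can be made concrete with a small Python sketch. It is a straightforward fixed-point procedure, not necessarily the polynomial-time algorithm the paper proposes: it alternates between peeling the conceptual graph down to its k-core and splitting the survivors into connected components of the physical graph, until a candidate passes both checks unchanged. The function names and the toy dual graph are hypothetical.

```python
def k_core_vertices(adj, vertices, k):
    """Vertices of the k-core of the subgraph of adj induced by `vertices`."""
    alive = set(vertices)
    deg = {v: sum(1 for u in adj[v] if u in alive) for v in alive}
    stack = [v for v in alive if deg[v] < k]
    while stack:
        v = stack.pop()
        if v not in alive:
            continue  # already peeled via a duplicate stack entry
        alive.remove(v)
        for u in adj[v]:
            if u in alive:
                deg[u] -= 1
                if deg[u] < k:
                    stack.append(u)
    return alive


def components(adj, vertices):
    """Connected components of the subgraph of adj induced by `vertices`."""
    remaining = set(vertices)
    comps = []
    while remaining:
        root = remaining.pop()
        comp, frontier = {root}, [root]
        while frontier:
            v = frontier.pop()
            for u in adj[v]:
                if u in remaining:
                    remaining.remove(u)
                    comp.add(u)
                    frontier.append(u)
        comps.append(comp)
    return comps


def k_ccos(conceptual, physical, k):
    """All maximal vertex sets that are k-cores conceptually and connected physically."""
    results, work = [], [set(conceptual)]
    while work:
        cand = k_core_vertices(conceptual, work.pop(), k)
        if not cand:
            continue
        parts = components(physical, cand)
        if len(parts) == 1:
            results.append(parts[0])  # stable: passes both checks
        else:
            work.extend(parts)  # each part is strictly smaller, so this terminates
    return results


# Toy dual graph: two conceptual triangles joined by edge 2-3,
# but physically split into two separate paths.
conceptual = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2, 4, 5], 4: [3, 5], 5: [3, 4]}
physical = {0: [1], 1: [0, 2], 2: [1], 3: [4], 4: [3, 5], 5: [4]}
found = sorted(sorted(s) for s in k_ccos(conceptual, physical, 2))
print(found)
```

In the toy example the whole vertex set is a conceptual 2-core, but it is not physically connected, so the procedure returns the two triangles separately.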
Core Decomposition in Multilayer Networks: Theory, Algorithms, and Applications
Multilayer networks are a powerful paradigm to model complex systems, where
multiple relations occur between the same entities. Despite the keen interest
in a variety of tasks, algorithms, and analyses in this type of network, the
problem of extracting dense subgraphs has remained largely unexplored so far.
In this work we study the problem of core decomposition of a multilayer
network. The multilayer context is much more challenging as no total order exists
among multilayer cores; rather, they form a lattice whose size is exponential
in the number of layers. In this setting we devise three algorithms which
differ in the way they visit the core lattice and in their pruning techniques.
We then move a step forward and study the problem of extracting the
inner-most (also known as maximal) cores, i.e., the cores that are not
dominated by any other core in terms of their core index in all the layers.
Inner-most cores are typically orders of magnitude fewer than all the cores.
Motivated by this, we devise an algorithm that effectively exploits the
maximality property and extracts inner-most cores directly, without first
computing a complete decomposition.
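To make the notion of a multilayer core more tangible, here is a minimal Python sketch that computes a single core for one given vector of per-layer thresholds. It assumes a definition the abstract does not state explicitly: the core for a vector k is the maximal vertex set in which every vertex has at least k[l] neighbors inside the set in each layer l. It does not enumerate the core lattice or implement any of the paper's three algorithms, and the name `multilayer_core` is hypothetical.

```python
def multilayer_core(layers, k):
    """Maximal vertex set whose induced degree in layer l is at least k[l].

    layers: list of adjacency dicts over the same vertex set
            (undirected, no self-loops, no repeated neighbors).
    k:      sequence of per-layer minimum degrees, one entry per layer.
    """
    alive = set(layers[0])
    deg = [{v: len(adj[v]) for v in alive} for adj in layers]
    stack = [v for v in alive
             if any(deg[l][v] < k[l] for l in range(len(layers)))]
    while stack:
        v = stack.pop()
        if v not in alive:
            continue  # already peeled via a duplicate stack entry
        alive.remove(v)
        for l, adj in enumerate(layers):
            for u in adj[v]:
                if u in alive:
                    deg[l][u] -= 1
                    if deg[l][u] < k[l]:
                        stack.append(u)
    return alive


# Toy network: layer A is a 4-clique; layer B is a triangle with vertex 3 isolated.
layer_a = {0: [1, 2, 3], 1: [0, 2, 3], 2: [0, 1, 3], 3: [0, 1, 2]}
layer_b = {0: [1, 2], 1: [0, 2], 2: [0, 1], 3: []}
core_22 = multilayer_core([layer_a, layer_b], (2, 2))
core_30 = multilayer_core([layer_a, layer_b], (3, 0))
core_31 = multilayer_core([layer_a, layer_b], (3, 1))
print(core_22, core_30, core_31)
```

The vectors (2, 2) and (3, 0) are incomparable and both give non-empty cores, which illustrates why the cores form a lattice rather than a chain.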
Finally, we showcase the multilayer core-decomposition tool in a variety of
scenarios and problems. We start by considering the problem of densest-subgraph
extraction in multilayer networks. We introduce a definition of multilayer
densest subgraph that trades off between high density and number of layers in
which the high density holds, and exploit multilayer core decomposition to
approximate this problem with quality guarantees. As further applications, we
show how to utilize multilayer core decomposition to speed up the extraction of
frequent cross-graph quasi-cliques and to generalize the community-search
problem to the multilayer setting.