I/O Efficient Core Graph Decomposition at Web Scale
Core decomposition is a fundamental graph problem with a large number of
applications. Most existing approaches for core decomposition assume that the graph is kept in the main memory of a machine. However, many real-world graphs are
big and may not reside in memory. In the literature, there is only one work for
I/O efficient core decomposition that avoids loading the whole graph in memory.
However, this approach is not scalable to handle big graphs because it cannot bound the memory size and may load most of the graph into memory. In
addition, this approach can hardly handle graph updates. In this paper, we
study I/O efficient core decomposition following a semi-external model, which
only allows node information to be loaded in memory. This model works well for many web-scale graphs. We propose a semi-external algorithm and two optimized algorithms for I/O efficient core decomposition using very simple structures and a simple data access model. To handle dynamic graph updates, we show that our
algorithm can be naturally extended to handle edge deletion. We also propose an
I/O efficient core maintenance algorithm to handle edge insertion, and an
improved algorithm to further reduce I/O and CPU cost by investigating some new
graph properties. We conduct extensive experiments on 12 large real graphs. Our optimal algorithm significantly outperforms the existing I/O efficient algorithm in terms of both processing time and memory consumption. For many
memory-resident graphs, our algorithms for both core decomposition and
maintenance can even outperform the in-memory algorithm due to the simple
structures and data access model used. Our algorithms are very scalable to
handle web-scale graphs. As an example, we are the first to handle a web graph with 978.5 million nodes and 42.6 billion edges using less than 4.2 GB of memory.
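For readers unfamiliar with the problem these papers study, the sketch below shows the classic in-memory peeling computation of core numbers. It is only a minimal illustration of core decomposition itself, using an assumed adjacency-dict representation and a simple (not strictly linear-time) bucket scan; it is not the semi-external or I/O efficient algorithms proposed above.

```python
from collections import defaultdict

def core_decomposition(adj):
    """adj: dict mapping each vertex to the set of its neighbours (undirected).
    Returns {vertex: core number} via iterative minimum-degree peeling.
    For simplicity this scans the degree buckets each step rather than
    using the strictly linear-time bookkeeping of the in-memory algorithm."""
    degree = {v: len(nbrs) for v, nbrs in adj.items()}
    buckets = defaultdict(set)            # current degree -> unpeeled vertices
    for v, d in degree.items():
        buckets[d].add(v)
    core = {}
    k = 0
    for _ in range(len(adj)):
        # Peel a vertex of minimum remaining degree; its core number is the
        # largest minimum degree observed so far during the peeling.
        d = min(x for x in buckets if buckets[x])
        k = max(k, d)
        v = buckets[d].pop()
        core[v] = k
        for u in adj[v]:
            if u not in core:             # u has not been peeled yet
                buckets[degree[u]].discard(u)
                degree[u] -= 1
                buckets[degree[u]].add(u)
    return core

# Tiny example: a triangle (0, 1, 2) with a pendant vertex 3.
# Expected core numbers: 0, 1, 2 -> 2 and 3 -> 1.
if __name__ == "__main__":
    g = {0: {1, 2, 3}, 1: {0, 2}, 2: {0, 1}, 3: {0}}
    print(core_decomposition(g))
```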
Parallel Order-Based Core Maintenance in Dynamic Graphs
The core numbers of the vertices in a graph define one of the most well-studied cohesive subgraph models, in part because they can be computed in linear time. In practice, many data graphs are dynamic: they change continuously as edges are inserted or removed. Updating the core numbers under such edge insertions and deletions is called core maintenance. When a burst of a large number of inserted or removed edges arrives, these edges must be handled promptly to keep up with the data stream. There are two main sequential
algorithms for core maintenance, \textsc{Traversal} and \textsc{Order}. It is
proved that the \textsc{Order} algorithm significantly outperforms the \textsc{Traversal} algorithm over all tested graphs, with speedups of up to 2,083 times.
To the best of our knowledge, all existing parallel approaches are based on
the \textsc{Traversal} algorithm; moreover, their parallelism exists only across affected vertices with different core numbers, so they reduce to sequential execution when all vertices have the same core number. In this paper, we propose a new parallel core maintenance algorithm based on the \textsc{Order} algorithm. Importantly, our new approach always exposes parallelism, even for graphs where all vertices have the same core number. Extensive experiments are conducted over
real-world, temporal, and synthetic graphs on a 64-core machine. The results
show that, for inserting and removing 100,000 edges using 16 workers, our method achieves up to 289x and 10x speedups, respectively, compared with the most efficient existing method.
Comment: Published at the 52nd International Conference on Parallel Processing (ICPP 2023), 17 pages, 7 figures, 2 tables
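To make the core-maintenance problem concrete, the sketch below is a naive recompute-from-scratch baseline that reuses the core_decomposition function from the earlier sketch. It only defines what a maintenance algorithm must output after each update; it is not the \textsc{Traversal}, \textsc{Order}, or parallel algorithms discussed here, and the helper names are illustrative.

```python
# Naive baseline for core maintenance: after every edge insertion or deletion,
# recompute all core numbers from scratch (reusing core_decomposition from the
# earlier sketch). Incremental algorithms such as Traversal and Order must
# produce exactly these core numbers while touching far fewer vertices.

def insert_edge(adj, u, v):
    adj.setdefault(u, set()).add(v)
    adj.setdefault(v, set()).add(u)

def delete_edge(adj, u, v):
    adj[u].discard(v)
    adj[v].discard(u)

def maintain_cores_naive(adj, updates):
    """updates: iterable of ('+', u, v) insertions or ('-', u, v) deletions.
    Yields the full core-number dict after each update; this serves as the
    correctness reference that any maintenance algorithm must match."""
    for op, u, v in updates:
        (insert_edge if op == '+' else delete_edge)(adj, u, v)
        yield core_decomposition(adj)

# A property that incremental algorithms exploit: inserting or deleting a
# single edge changes each vertex's core number by at most one.
```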
A Fast Order-Based Approach for Core Maintenance
Graphs have been widely used in many applications such as social networks,
collaboration networks, and biological networks. One important graph analytics task is to explore cohesive subgraphs in a large graph. Among the several cohesive subgraph models studied, the k-core is one that can be computed in linear time for a static graph. Since graphs evolve in real applications, in this paper we study core maintenance, whose goal is to reduce the computational cost of recomputing k-cores when a graph is updated dynamically from time to time. We
identify drawbacks of the existing efficient algorithm: it needs a large search space to find the vertices that must be updated, and it incurs high overhead to maintain its index when the graph is updated. We propose a new order-based approach that maintains an order among the vertices, called the k-order, while the graph is updated. Our new algorithm can significantly outperform the
state-of-the-art algorithm by up to 3 orders of magnitude on the 11 large real graphs tested. We report our findings in this paper.
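As a rough illustration of the kind of vertex ordering an order-based approach maintains, the snippet below extracts the static peeling order from the core_decomposition sketch above; the paper's actual k-order and its incremental maintenance are more involved, and the example graph is assumed.

```python
# In the core_decomposition sketch above, core numbers are assigned in
# non-decreasing order as vertices are peeled, and Python dicts preserve
# insertion order, so the dict's key order already yields one valid peeling
# order. A static order of this kind is what an order-based approach keeps
# updated incrementally instead of recomputing after every edge change.
g = {0: {1, 2, 3}, 1: {0, 2}, 2: {0, 1}, 3: {0}}
cores = core_decomposition(g)
order = list(cores)                  # vertices in peeling (removal) order
assert all(cores[order[i]] <= cores[order[i + 1]]
           for i in range(len(order) - 1))
print(order, cores)
```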
Storage and Search in Dynamic Peer-to-Peer Networks
We study robust and efficient distributed algorithms for searching, storing,
and maintaining data in dynamic Peer-to-Peer (P2P) networks. P2P networks are
highly dynamic networks that experience heavy node churn (i.e., nodes join and
leave the network continuously over time). Our goal is to guarantee, despite
high node churn rate, that a large number of nodes in the network can store,
retrieve, and maintain a large number of data items. Our main contributions are
fast randomized distributed algorithms that guarantee the above with high
probability (whp) even under high adversarial churn:
1. A randomized distributed search algorithm that (whp) guarantees that
searches from as many as nodes ( is the stable network size)
succeed in -rounds despite churn, for
any small constant , per round. We assume that the churn is
controlled by an oblivious adversary (that has complete knowledge and control
of what nodes join and leave and at what time, but is oblivious to the random
choices made by the algorithm).
2. A storage and maintenance algorithm that guarantees (whp) data items can
be efficiently stored (with only copies of each data item)
and maintained in a dynamic P2P network with churn rate up to
per round. Our search algorithm together with our
storage and maintenance algorithm guarantees that as many as nodes
can efficiently store, maintain, and search even under churn per round. Our algorithms require only polylogarithmic in bits to
be processed and sent (per round) by each node.
To the best of our knowledge, our algorithms are the first known fully-distributed storage and search algorithms that provably work under highly dynamic settings (i.e., high churn rates per step).
Comment: to appear at SPAA 2013
Incremental Maintenance of Maximal Cliques in a Dynamic Graph
We consider the maintenance of the set of all maximal cliques in a dynamic
graph that is changing through the addition or deletion of edges. We present
nearly tight bounds on the magnitude of change in the set of maximal cliques,
as well as the first change-sensitive algorithms for clique maintenance, whose
runtime is proportional to the magnitude of the change in the set of maximal
cliques. We present experimental results showing these algorithms are efficient
in practice and are faster than prior work by two to three orders of magnitude.
Comment: 18 pages, 8 figures
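For context, recomputing all maximal cliques after every update is the brute-force alternative that change-sensitive maintenance avoids. A minimal static enumerator (Bron-Kerbosch with pivoting, over an assumed adjacency-dict representation) is sketched below; it is not the incremental algorithm of the paper.

```python
def maximal_cliques(adj):
    """Enumerate all maximal cliques of an undirected graph given as
    {vertex: set(neighbours)}, using Bron-Kerbosch with pivoting."""
    cliques = []

    def expand(R, P, X):
        if not P and not X:
            cliques.append(set(R))    # R is maximal: nothing can extend it
            return
        # Pick a pivot with many neighbours in P to prune the branching.
        pivot = max(P | X, key=lambda u: len(adj[u] & P))
        for v in list(P - adj[pivot]):
            expand(R | {v}, P & adj[v], X & adj[v])
            P.remove(v)
            X.add(v)

    expand(set(), set(adj), set())
    return cliques

# Example: a triangle plus a pendant edge has maximal cliques {0,1,2} and {0,3}.
g = {0: {1, 2, 3}, 1: {0, 2}, 2: {0, 1}, 3: {0}}
print(maximal_cliques(g))
```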