64,696 research outputs found
Towards Maximum Independent Sets on Massive Graphs
Maximum independent set (MIS) is a fundamental problem in graph theory and it has important applications in many areas such as social network analysis, graphical information systems and coding theory. The problem is NP-hard, and there has been numerous studies on its approximate solutions. While successful to a certain degree, the existing methods require memory space at least linear in the size of the input graph. This has become a serious concern in view of the massive volume of today’s fast-growing graphs. In this paper, we study the MIS problem under the semi-external setting, which assumes that the main memory can accommodate all vertices of the graph but not all edges. We present a greedy algorithm and a general vertex-swap framework, which swaps vertices to incrementally increase the size of independent sets. Our solutions require only few sequential scans of graphs on the disk file, thus enabling in-memory computation without costly random disk accesses. Experiments on large-scale datasets show that our solutions are able to compute a large independent set for a massive graph with 59 million vertices and 151 million edges using a commodity machine, with a memory cost of 469MB and a time cost of three minutes.Graph, maximum independent setPeer reviewe
A Tutorial on Clique Problems in Communications and Signal Processing
Since its first use by Euler on the problem of the seven bridges of
K\"onigsberg, graph theory has shown excellent abilities in solving and
unveiling the properties of multiple discrete optimization problems. The study
of the structure of some integer programs reveals equivalence with graph theory
problems making a large body of the literature readily available for solving
and characterizing the complexity of these problems. This tutorial presents a
framework for utilizing a particular graph theory problem, known as the clique
problem, for solving communications and signal processing problems. In
particular, the paper aims to illustrate the structural properties of integer
programs that can be formulated as clique problems through multiple examples in
communications and signal processing. To that end, the first part of the
tutorial provides various optimal and heuristic solutions for the maximum
clique, maximum weight clique, and -clique problems. The tutorial, further,
illustrates the use of the clique formulation through numerous contemporary
examples in communications and signal processing, mainly in maximum access for
non-orthogonal multiple access networks, throughput maximization using index
and instantly decodable network coding, collision-free radio frequency
identification networks, and resource allocation in cloud-radio access
networks. Finally, the tutorial sheds light on the recent advances of such
applications, and provides technical insights on ways of dealing with mixed
discrete-continuous optimization problems
Triadic Measures on Graphs: The Power of Wedge Sampling
Graphs are used to model interactions in a variety of contexts, and there is
a growing need to quickly assess the structure of a graph. Some of the most
useful graph metrics, especially those measuring social cohesion, are based on
triangles. Despite the importance of these triadic measures, associated
algorithms can be extremely expensive. We propose a new method based on wedge
sampling. This versatile technique allows for the fast and accurate
approximation of all current variants of clustering coefficients and enables
rapid uniform sampling of the triangles of a graph. Our methods come with
provable and practical time-approximation tradeoffs for all computations. We
provide extensive results that show our methods are orders of magnitude faster
than the state-of-the-art, while providing nearly the accuracy of full
enumeration. Our results will enable more wide-scale adoption of triadic
measures for analysis of extremely large graphs, as demonstrated on several
real-world examples
Coresets Meet EDCS: Algorithms for Matching and Vertex Cover on Massive Graphs
As massive graphs become more prevalent, there is a rapidly growing need for
scalable algorithms that solve classical graph problems, such as maximum
matching and minimum vertex cover, on large datasets. For massive inputs,
several different computational models have been introduced, including the
streaming model, the distributed communication model, and the massively
parallel computation (MPC) model that is a common abstraction of
MapReduce-style computation. In each model, algorithms are analyzed in terms of
resources such as space used or rounds of communication needed, in addition to
the more traditional approximation ratio.
In this paper, we give a single unified approach that yields better
approximation algorithms for matching and vertex cover in all these models. The
highlights include:
* The first one pass, significantly-better-than-2-approximation for matching
in random arrival streams that uses subquadratic space, namely a
-approximation streaming algorithm that uses space
for constant .
* The first 2-round, better-than-2-approximation for matching in the MPC
model that uses subquadratic space per machine, namely a
-approximation algorithm with memory per
machine for constant .
By building on our unified approach, we further develop parallel algorithms
in the MPC model that give a -approximation to matching and an
-approximation to vertex cover in only MPC rounds and
memory per machine. These results settle multiple open
questions posed in the recent paper of Czumaj~et.al. [STOC 2018]
- …